The Wayback Machine - https://web.archive.org/web/20200825030844/https://github.com/pydata/xarray/issues/4228
to_dataframe: no valid index for a 0-dimensional object #4228

Open
ghislainp opened this issue Jul 15, 2020 · 3 comments

@ghislainp ghislainp commented Jul 15, 2020

What happened:
xr.DataArray([1], coords=[('onecoord', [2])]).sel(onecoord=2).to_dataframe(name='name') raises an exception: ValueError: no valid index for a 0-dimensional object

What you expected to happen:

the same behavior as: xr.DataArray([1], coords=[('onecoord', [2])]).to_dataframe(name='name')

Anything else we need to know?:

I see that the array after the selection has no "dims" anymore, and this is what causes the error. But it still has one entry in "coords", which is confusing. Is there any documentation about this difference?

Environment:

INSTALLED VERSIONS
------------------
commit: None
python: 3.7.6 | packaged by conda-forge | (default, Jun 1 2020, 18:57:50)
[GCC 7.5.0]
python-bits: 64
OS: Linux
OS-release: 4.19.0-9-amd64
machine: x86_64
processor:
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: en_US.UTF-8
libhdf5: 1.10.5
libnetcdf: 4.7.4

xarray: 0.15.1
pandas: 1.0.4
numpy: 1.18.5
scipy: 1.4.1
netCDF4: 1.5.3
pydap: None
h5netcdf: None
h5py: 2.10.0
Nio: None
zarr: 2.4.0
cftime: 1.1.3
nc_time_axis: None
PseudoNetCDF: None
rasterio: None
cfgrib: None
iris: None
bottleneck: 1.3.2
dask: 2.18.1
distributed: 2.18.0
matplotlib: 3.2.1
cartopy: None
seaborn: 0.10.1
numbagg: None
setuptools: 47.3.1.post20200616
pip: 20.1.1
conda: 4.8.3
pytest: 5.4.3
IPython: 7.15.0
sphinx: 3.1.1

Contributor

@dcherian dcherian commented Jul 15, 2020

You need

xr.DataArray([1], coords=[('onecoord', [2])]).sel(onecoord=[2]).to_dataframe(name='name')

The difference is using onecoord=2 gives a scalar

>>> xr.DataArray([1], coords=[('onecoord', [2])]).sel(onecoord=2)
<xarray.DataArray ()>
array(1)
Coordinates:
    onecoord  int64 2

while using onecoord=[2] gives a 1 element vector

>>> xr.DataArray([1], coords=[('onecoord', [2])]).sel(onecoord=[2])
<xarray.DataArray (onecoord: 1)>
array([1])
Coordinates:
  * onecoord  (onecoord) int64 2

And to_dataframe cannot handle scalars.

I am not sure that there is a sensible way to convert a scalar DataArray to a DataFrame but we should throw a more informative error in any case.
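For readers who already hold a scalar result, one possible workaround (a sketch, not something proposed in this thread) is to promote the leftover scalar coordinate back to a length-1 dimension with `expand_dims` before converting. The variable names below are illustrative, and the exact error message may differ between xarray versions, so the sketch only checks for `ValueError`.

```python
import xarray as xr

da = xr.DataArray([1], coords=[("onecoord", [2])])

# Selecting with a scalar label drops the dimension but keeps the coordinate.
scalar = da.sel(onecoord=2)
assert scalar.dims == ()
assert "onecoord" in scalar.coords

# Converting the 0-dimensional result fails with a ValueError.
try:
    scalar.to_dataframe(name="name")
    raised = False
except ValueError:
    raised = True

# Promote the scalar coordinate back to a length-1 dimension, then convert.
restored = scalar.expand_dims("onecoord")
df = restored.to_dataframe(name="name")
print(df)
```

This gives the same shape of result as selecting with `onecoord=[2]` in the first place, which remains the simpler choice when the selection is under your control.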

Author

@ghislainp ghislainp commented Jul 15, 2020

Thanks for the very clear response. The behavior makes sense.

In fact, I should have explained what I'm trying to achieve, as this is a kind of "take". I have a dict like this:

{'label1' : dict(coord1=1, coord2=4), 
'label2' : dict(coord1=5, coord2=6),
'label3' : dict(coord1=4, coord2=2),
}

and I want to build an xarray object (and then a dataframe) with coord1 and coord2 replaced by a new dimension with values 'label1', 'label2', 'label3'.

I've done that by iterating over the dict, selecting with sel using the dict values, converting to a dataframe, and then concatenating the dataframes: pd.concat([x.sel(**d[k]).to_dataframe() for k in d])

A better option would be to do this "sel" or "take" with xarray only.
Do you have an idea how to do it with existing xarray methods?

Contributor

@dcherian dcherian commented Jul 15, 2020

You could do it with "advanced indexing" by providing a dataarray to the .sel or .isel methods: https://xarray.pydata.org/en/stable/indexing.html#more-advanced-indexing

da = xr.DataArray([[1, 2, 3], [4,5,6]], dims=["coord1", "coord2"], coords={"coord2": [10, 20, 30], "coord1": [1,2]}) 

i1 = xr.DataArray([1, 0], dims=["z"], coords={"z": ["label1", "label2"]})
i2 = xr.DataArray([2, 1], dims=["z"], coords={"z": ["label1", "label2"]})  

da.isel(coord1=i1, coord2=i2, drop=True).to_dataframe(name="asd")
        asd
z          
label1    6
label2    2
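The same idea can be sketched with label-based `.sel`, building the indexers directly from a dict shaped like the one in the question. The coordinate values in `lookup`, the dimension name `z`, and the column name `value` are assumptions chosen to match the small array above; they are not taken from the original data.

```python
import xarray as xr

# Same small array as in the example above.
da = xr.DataArray(
    [[1, 2, 3], [4, 5, 6]],
    dims=["coord1", "coord2"],
    coords={"coord1": [1, 2], "coord2": [10, 20, 30]},
)

# Hypothetical lookup table: label -> coordinate values to select.
lookup = {
    "label1": dict(coord1=2, coord2=30),
    "label2": dict(coord1=1, coord2=20),
}

labels = list(lookup)

# One indexer per original dimension, all sharing the new dimension "z".
indexers = {
    name: xr.DataArray(
        [lookup[label][name] for label in labels],
        dims="z",
        coords={"z": labels},
    )
    for name in ("coord1", "coord2")
}

# Vectorized selection: coord1 and coord2 are replaced by the single dimension "z".
picked = da.sel(indexers)
df = picked.to_dataframe(name="value")
print(df)
```

Because both indexers share the dimension `z`, the selection picks one element per label rather than the outer product of the two coordinate lists.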
@dcherian dcherian changed the title no valid index for a 0-dimensional object to_dataframe: no valid index for a 0-dimensional object Jul 15, 2020