GraphQL¶
vaex-graphql is a plugin package that exposes a DataFrame via a GraphQL interface. This makes it easy to share data, aggregations and statistics, or machine learning models with frontends or other programs through a standard query language.
(Install with $ pip install vaex-graphql; no conda-forge support yet.)
[3]:
import vaex
df = vaex.datasets.titanic()
df
[3]:
# | pclass | survived | name | sex | age | sibsp | parch | ticket | fare | cabin | embarked | boat | body | home_dest |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 1 | True | Allen, Miss. Elisabeth Walton | female | 29.0 | 0 | 0 | 24160 | 211.3375 | B5 | S | 2 | nan | St Louis, MO |
1 | 1 | True | Allison, Master. Hudson Trevor | male | 0.9167 | 1 | 2 | 113781 | 151.55 | C22 C26 | S | 11 | nan | Montreal, PQ / Chesterville, ON |
2 | 1 | False | Allison, Miss. Helen Loraine | female | 2.0 | 1 | 2 | 113781 | 151.55 | C22 C26 | S | None | nan | Montreal, PQ / Chesterville, ON |
3 | 1 | False | Allison, Mr. Hudson Joshua Creighton | male | 30.0 | 1 | 2 | 113781 | 151.55 | C22 C26 | S | None | 135.0 | Montreal, PQ / Chesterville, ON |
4 | 1 | False | Allison, Mrs. Hudson J C (Bessie Waldo Daniels) | female | 25.0 | 1 | 2 | 113781 | 151.55 | C22 C26 | S | None | nan | Montreal, PQ / Chesterville, ON |
... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
1,304 | 3 | False | Zabour, Miss. Hileni | female | 14.5 | 1 | 0 | 2665 | 14.4542 | None | C | None | 328.0 | None |
1,305 | 3 | False | Zabour, Miss. Thamine | female | nan | 1 | 0 | 2665 | 14.4542 | None | C | None | nan | None |
1,306 | 3 | False | Zakarian, Mr. Mapriededer | male | 26.5 | 0 | 0 | 2656 | 7.225 | None | C | None | 304.0 | None |
1,307 | 3 | False | Zakarian, Mr. Ortin | male | 27.0 | 0 | 0 | 2670 | 7.225 | None | C | None | nan | None |
1,308 | 3 | False | Zimmerman, Mr. Leo | male | 29.0 | 0 | 0 | 315082 | 7.875 | None | S | None | nan | None |
[10]:
result = df.graphql.execute("""
{
  df {
    min {
      age
      fare
    }
    mean {
      age
      fare
    }
    max {
      age
      fare
    }
    groupby {
      sex {
        count
        mean {
          age
        }
      }
    }
  }
}
""")
result.data
[10]:
OrderedDict([('df',
OrderedDict([('min',
OrderedDict([('age', 0.1667), ('fare', 0.0)])),
('mean',
OrderedDict([('age', 29.8811345124283),
('fare', 33.29547928134572)])),
('max',
OrderedDict([('age', 80.0), ('fare', 512.3292)])),
('groupby',
OrderedDict([('sex',
OrderedDict([('count', [466, 843]),
('mean',
OrderedDict([('age',
[28.6870706185567,
30.585232978723408])]))]))]))]))])
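Because result.data is a nested mapping, it can be reshaped with ordinary Python before it is handed to a frontend. The following is a minimal sketch, not part of vaex-graphql: the values are copied from the output above into plain dicts so that it runs without vaex, and the helper name summarize is hypothetical. The output above does not label the groups, so the sketch keeps the group-level values positional.

```python
# Values copied from the output above; plain dicts index the same way
# as the OrderedDicts returned in result.data.
data = {
    "df": {
        "min": {"age": 0.1667, "fare": 0.0},
        "mean": {"age": 29.8811345124283, "fare": 33.29547928134572},
        "max": {"age": 80.0, "fare": 512.3292},
        "groupby": {
            "sex": {
                "count": [466, 843],
                "mean": {"age": [28.6870706185567, 30.585232978723408]},
            }
        },
    }
}


def summarize(data, columns=("age", "fare"), stats=("min", "mean", "max")):
    """Hypothetical helper: pivot {stat: {column: value}} into {column: {stat: value}}."""
    frame = data["df"]
    return {col: {stat: frame[stat][col] for stat in stats} for col in columns}


summary = summarize(data)
for col, values in summary.items():
    print(col, values)

# Group-level results come back as parallel lists with one entry per group,
# so zip pairs each group's count with its mean age.
by_sex = data["df"]["groupby"]["sex"]
groups = list(zip(by_sex["count"], by_sex["mean"]["age"]))
print(groups)
```

With a live DataFrame, the same helper could be applied to result.data directly.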
Pandas support¶
Importing vaex.graphql also registers a pandas accessor, so the same graphql interface is available on pandas DataFrames.
[11]:
df_pandas = df.to_pandas_df()
[20]:
df_pandas.graphql.execute("""
{
  df(where: {age: {_gt: 20}}) {
    row(offset: 3, limit: 2) {
      name
      survived
    }
  }
}
"""
).data
[20]:
OrderedDict([('df',
OrderedDict([('row',
[OrderedDict([('name', 'Anderson, Mr. Harry'),
('survived', True)]),
OrderedDict([('name',
'Andrews, Miss. Kornelia Theodosia'),
('survived', True)])])]))])
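To make the arguments in the query above concrete, here is a rough pure-Python sketch of what they appear to do. It assumes the where filter is applied first and that offset and limit then select a window of the matching rows, as in the Hasura API this package follows; that ordering is an assumption, not something confirmed here. The helper select_rows is hypothetical, and the data is limited to the first five rows of the table shown earlier, so the sketch uses offset 1 instead of 3.

```python
# Rows 0-4 of the Titanic table shown earlier, reduced to three columns.
rows = [
    {"name": "Allen, Miss. Elisabeth Walton", "age": 29.0, "survived": True},
    {"name": "Allison, Master. Hudson Trevor", "age": 0.9167, "survived": True},
    {"name": "Allison, Miss. Helen Loraine", "age": 2.0, "survived": False},
    {"name": "Allison, Mr. Hudson Joshua Creighton", "age": 30.0, "survived": False},
    {"name": "Allison, Mrs. Hudson J C (Bessie Waldo Daniels)", "age": 25.0, "survived": False},
]


def select_rows(rows, min_age, offset, limit, fields):
    """Hypothetical stand-in for df(where: {age: {_gt: ...}}) { row(offset, limit) {...} }."""
    # Assumed order of operations: filter, then skip `offset` rows, then keep `limit` rows.
    matching = [r for r in rows if r["age"] > min_age]
    window = matching[offset:offset + limit]
    # Keep only the requested fields, mirroring the field selection in the query.
    return [{f: r[f] for f in fields} for r in window]


picked = select_rows(rows, min_age=20, offset=1, limit=2, fields=("name", "survived"))
for row in picked:
    print(row)
```

Rows 0, 3 and 4 pass the age filter; skipping one and keeping two leaves rows 3 and 4.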
Server¶
The easiest way to learn the GraphQL query language and the vaex interface is to launch a server and explore it with the GraphiQL graphical interface, which offers autocomplete and a schema explorer.
We try to stay close to the Hasura API: https://docs.hasura.io/1.0/graphql/manual/api-reference/graphql-api/query.html
A server can be started from the command line:
$ python -m vaex.graphql myfile.hdf5
Or from within Python, using df.graphql.serve.
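Once a server is running, another program can send it queries over HTTP. The sketch below uses only the Python standard library and follows the common GraphQL convention of POSTing a JSON body with a "query" key. The URL is a placeholder and an assumption: the host, port, and path are not stated here, so use whatever address your server reports at startup. The helper names build_request and run_query are hypothetical.

```python
import json
import urllib.request

# Placeholder address (an assumption): replace it with the host, port,
# and path that your running server actually reports.
GRAPHQL_URL = "http://localhost:8000/graphql"

QUERY = """
{
  df {
    mean {
      age
    }
  }
}
"""


def build_request(url, query):
    """Build a POST request that carries the query in a {"query": ...} JSON body."""
    body = json.dumps({"query": query}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def run_query(url, query, timeout=10):
    """Send the query and return the decoded JSON response."""
    with urllib.request.urlopen(build_request(url, query), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


request = build_request(GRAPHQL_URL, QUERY)
print(request.get_method(), request.full_url)

# Uncomment once a server is running at GRAPHQL_URL:
# print(run_query(GRAPHQL_URL, QUERY))
```

Building the request separately from sending it keeps the payload easy to inspect before any network call is made.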
GraphiQL¶
See https://github.com/mariobuikhuizen/ipygraphql for a graphical widget, or a mybinder to try out a live example.