big-data
Here are 2,109 public repositories matching this topic...
I was going through the existing enhancement issues again and thought it'd be nice to collect ideas for spaCy plugins and related projects. There are always people in the community who are looking for new things to build, so here's some inspiration.
If you have questions about the projects I suggested,
I've noticed all the links on template homepages are broken.
I can't find where these links are set, though. It doesn't seem to be here
I'm looking at the React tutorial at https://github.com/amark/gun/wiki/React-Tutorial and noticed that the code examples are using deprecated lifecycle methods such as componentWillMount.
Use case
ClickHouse/ClickHouse#7971
Currently array_position only returns the first occurrence of the given element. We want to extend array_position to take an additional parameter, instance, similar to strpos.
array_position(x, element, instance) -> bigint
For example, for array x: [1, 3, 2, 1, 4, 3, 2, 5, 4, 1]:
array_position(x, 1) = 1 -- existing function
array_position(x, 1, 1) = 1 -- same as existing behavior
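For illustration, here is a minimal Python sketch of the proposed semantics (the behaviour for a missing occurrence is an assumption, mirroring Presto's convention of returning 0 for "not found"; the real change would of course live in Presto itself):

def array_position(x, element, instance=1):
    # Return the 1-based position of the `instance`-th occurrence of
    # `element` in `x`, or 0 if there is no such occurrence.
    count = 0
    for i, value in enumerate(x, start=1):
        if value == element:
            count += 1
            if count == instance:
                return i
    return 0

x = [1, 3, 2, 1, 4, 3, 2, 5, 4, 1]
assert array_position(x, 1) == 1      # existing behaviour
assert array_position(x, 1, 2) == 4   # second occurrence of 1
assert array_position(x, 1, 3) == 10  # third occurrence of 1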
I am trying to deploy the app with the given ./sbt clean dist command, but I got this error:
Downloading sbt launcher for 1.3.8:
From https://repo.scala-sbt.org/scalasbt/maven-releases/org/scala-sbt/sbt-launch/1.3.8/sbt-launch-1.3.8.jar
To /root/.sbt/launchers/1.3.8/sbt-launch.jar
Downloading sbt launcher 1.3.8 md5 hash:
From https://repo.scala-sbt.org/scalasbt/maven-releas
Stefan Behnel wrote:
No. "@cython.cfunc" declares a function or method as a pure C function,
without a Python interface to it, and for methods, it only applies to
extension types and not regular Python classes.It's interesting that Cython allowed you to set it on the "
__iter__" method
which cannot, in fact, be a C method because it's one of Python's special
methods. We s
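To make the quoted explanation concrete, here is a small sketch in Cython's pure Python mode (the class and method names are made up for illustration, and it assumes the module is compiled with Cython, where @cython.cclass marks an extension type and @cython.cfunc marks a C-level method with no Python interface):

import cython

@cython.cclass                # extension type: cfunc methods are allowed here
class Numbers:
    data: list

    def __init__(self, data):
        self.data = data

    @cython.cfunc             # pure C method, not callable from Python code
    def _total(self) -> cython.int:
        s: cython.int = 0
        for v in self.data:
            s += v
        return s

    def total(self):          # a regular def method exposes the result to Python
        return self._total()

Special methods such as __iter__ must remain regular def methods, which is the point the quoted reply makes.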
This class could be used instead of a cd (column description) file, https://catboost.ai/docs/concepts/input-data_column-descfile.html, when creating a Pool from files. The class should have an init function and load and save methods, and the Pool init method should be able to accept an object of this class instead of a cd file during initialization.
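A rough sketch of the interface this describes might look like the following (the class name, file-format details, and the Pool keyword are hypothetical here, not an existing catboost API):

class ColumnDescription:
    # In-memory equivalent of a cd (column description) file:
    # maps column indices to a role such as Label, Categ or Num,
    # plus an optional feature name.

    def __init__(self, columns=None):
        self.columns = dict(columns or {})   # {index: (type, name_or_None)}

    def load(self, path):
        # cd files are tab-separated: <column index>\t<type>[\t<name>]
        with open(path) as f:
            for line in f:
                parts = line.rstrip("\n").split("\t")
                if not parts or not parts[0]:
                    continue
                self.columns[int(parts[0])] = (
                    parts[1],
                    parts[2] if len(parts) > 2 else None,
                )
        return self

    def save(self, path):
        with open(path, "w") as f:
            for idx in sorted(self.columns):
                ctype, name = self.columns[idx]
                fields = [str(idx), ctype] + ([name] if name else [])
                f.write("\t".join(fields) + "\n")

# Desired usage (hypothetical):
# Pool("train.tsv", column_description=ColumnDescription().load("train.cd"))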
Summary
CouchDB keeps a list of purge infos to ensure that purges can be applied on a cluster without purged documents being re-introduced by internal replication.
It would be useful to make this list available to replication clients like PouchDB, which could then apply local purges on their own. I know PouchDB doesn’t implement purge just yet, but it’s something that folks will need befor
Based on this: https://pachyderm-users.slack.com/archives/GJSKMC0F6/p1588199373044100
Add a list of ports that need to be exposed to the outside world to make Pachyderm accessible.
With the addition of OIDC support in 1.11, another port will likely be added to this list.
The docs have a great intro that explains the technology build-up that led to inventing Stream, but then it stops without explaining how Stream uses Cassandra + Redis (plus a Celery message queue?) to solve this problem. (For all I know it doesn't.)
As a developer, a quick explanation of how this framework solves the
I would like to point out that identifiers like “_DLL_HEADER” and “_MOLOCH_LUA_” eventually [do not fit](https://www.securecoding.cert.org/conf
In a cloud-native Kubernetes environment, the logging architecture almost always assumes that all needed logs are sent to stdout. Stdout works as a unified source of logs from which different tools read them, reorganize them if needed, and route them to destinations like analytics dashboards.
Hazelcast Diagnostics are very useful when troubleshooting performance and stability issues, but currently it
Spark 2.3 officially supports running on Kubernetes, while our "Run on Kubernetes" guide is still based on a special version of Spark 2.2, which is out of date. We need to:
- update that document to Spark 2.3
- release the corresponding Docker images.
Hello Vespa Team,
Can you please consider supporting a properties file that is available at run time along with the model? That way some metadata, e.g. threshold, label, etc., can be associated with the model.
This is for the stateless evaluation of models (XGBoost, TensorFlow, ONNX, etc.), which is supported in Vespa.
Thank you,
Pinank
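For illustration, the request above amounts to a small sidecar metadata file shipped next to the model and read by the application at evaluation time; a minimal sketch, assuming a simple key=value properties format (the file name and keys are hypothetical):

# model.properties (hypothetical sidecar next to the model file):
#   threshold=0.73
#   labels=negative,positive

def load_model_properties(path):
    # Minimal key=value parser for the sidecar metadata file.
    props = {}
    with open(path) as f:
        for line in f:
            line = line.strip()
            if not line or line.startswith("#"):
                continue
            key, _, value = line.partition("=")
            props[key.strip()] = value.strip()
    return props

props = load_model_properties("model.properties")
threshold = float(props["threshold"])
labels = props["labels"].split(",")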
I have noticed a small error in the documentation around S3 configurations:
https://docs.delta.io/latest/delta-storage.html#amazon-s3
On the read part, it should be load and not save:
spark.read.format("delta").load("s3a://<your-s3-bucket>/<path>/<to>/<delta-table>")
Also, I have successfully tested Delta 0.5.0 with on-premise S3 - https://min.io
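As a reference for the on-premises setup mentioned above, a minimal PySpark session configured for an s3a endpoint such as MinIO could look like this (the endpoint and credentials are placeholders, and it assumes the Delta and hadoop-aws jars are already on the classpath):

from pyspark.sql import SparkSession

spark = (
    SparkSession.builder
    .appName("delta-on-s3")
    .config("spark.hadoop.fs.s3a.endpoint", "https://minio.example.com:9000")  # placeholder endpoint
    .config("spark.hadoop.fs.s3a.access.key", "ACCESS_KEY")                    # placeholder credentials
    .config("spark.hadoop.fs.s3a.secret.key", "SECRET_KEY")
    .config("spark.hadoop.fs.s3a.path.style.access", "true")                   # MinIO typically needs path-style access
    .getOrCreate()
)

df = spark.read.format("delta").load("s3a://<your-s3-bucket>/<path>/<to>/<delta-table>")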
There were some quirks around the
The data source connects to Presto over JDBC. The Presto SQL is not complex, but when one task serves several dashboards, the returned data is null. The waiting code:
synchronized (context) {
    context.wait(10 * 60 * 1000); // wait up to 10 minutes for the result
}
The wake-up code:
synchronized (context) {
    context.setData(data);
    context.notify();
}
The returned data is null. I increased the wait timeout to 30 * 60 * 1000, and the error is the same.
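In case it helps, the usual remedy for this pattern, in any language, is to wait on a condition with a predicate (or in a loop), so that a notify that arrives before the wait, a spurious wakeup, or a timeout is not mistaken for the data being ready. A minimal Python sketch of that pattern (the code above is Java; this only illustrates the idea, it is not the project's code):

import threading

class Context:
    def __init__(self):
        self._cond = threading.Condition()
        self._data = None

    def set_data(self, data):
        with self._cond:
            self._data = data
            self._cond.notify()   # wake the waiter only after the data is set

    def wait_for_data(self, timeout):
        with self._cond:
            # wait_for re-checks the predicate, so an early notify,
            # a spurious wakeup, or a timeout cannot silently yield null data.
            if not self._cond.wait_for(lambda: self._data is not None, timeout):
                raise TimeoutError("query did not return within the timeout")
            return self._data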
The AlexNet implementation in TensorFlow has an incomplete architecture: two convolutional layers are missing. This issue is in reference to the Python notebook linked below.
https://github.com/donnemartin/data-science-ipython-notebooks/blob/master/deep-learning/tensor-flow-examples/notebooks/3_neural_networks/alexnet.ipynb
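For comparison, the canonical AlexNet has five convolutional layers (with pooling after the first, second, and fifth) followed by three fully connected layers. Below is a hedged tf.keras sketch of that stack, independent of the notebook's TF 1.x code, with layer sizes taken from the original paper and the input shape and class count as placeholders:

import tensorflow as tf

def alexnet(num_classes=1000):
    # Five convolutional layers, max-pooling after conv1, conv2 and conv5,
    # then three fully connected layers.
    return tf.keras.Sequential([
        tf.keras.layers.Conv2D(96, 11, strides=4, activation="relu",
                               input_shape=(227, 227, 3)),
        tf.keras.layers.MaxPooling2D(3, strides=2),
        tf.keras.layers.Conv2D(256, 5, padding="same", activation="relu"),
        tf.keras.layers.MaxPooling2D(3, strides=2),
        tf.keras.layers.Conv2D(384, 3, padding="same", activation="relu"),
        tf.keras.layers.Conv2D(384, 3, padding="same", activation="relu"),
        tf.keras.layers.Conv2D(256, 3, padding="same", activation="relu"),
        tf.keras.layers.MaxPooling2D(3, strides=2),
        tf.keras.layers.Flatten(),
        tf.keras.layers.Dense(4096, activation="relu"),
        tf.keras.layers.Dropout(0.5),
        tf.keras.layers.Dense(4096, activation="relu"),
        tf.keras.layers.Dropout(0.5),
        tf.keras.layers.Dense(num_classes, activation="softmax"),
    ])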