apache-spark
Apache Spark is an open source distributed general-purpose cluster-computing framework. It provides an interface for programming entire clusters with implicit data parallelism and fault tolerance.
Here are 954 public repositories matching this topic...
This is more of a question than a feature request.
When parsing JSON files, I need to sanitize the field names so that "field with spaces" becomes "field_with_spaces".
I want to preserve the original name as well, as metadata about the column if you like :)
There is a metadata field on StructField, but it is internal.
Why is it internal, and would it be possible or desirable to expose it?
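To make the request concrete, here is a minimal sketch in plain Python (not the library's API) of the kind of renaming described above. The helper names `sanitize_field_name` and `sanitize_with_metadata`, and the `original_name` metadata key, are hypothetical and chosen only for illustration. How the metadata would be attached to an actual StructField depends on the library and is left out.

```python
import re


def sanitize_field_name(name: str) -> str:
    """Replace each run of whitespace with a single underscore."""
    return re.sub(r"\s+", "_", name.strip())


def sanitize_with_metadata(field_names):
    """Pair each sanitized name with metadata holding the original name.

    The "original_name" key is an assumption made for this sketch.
    """
    return [
        (sanitize_field_name(original), {"original_name": original})
        for original in field_names
    ]


fields = sanitize_with_metadata(["field with spaces", "id"])
for sanitized, metadata in fields:
    print(sanitized, metadata)
```

Keeping the original name next to the sanitized one means the mapping can be reversed later, for example when writing the data back out with its original JSON keys. Note that this sketch does not handle two different names that sanitize to the same result.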
Created by Matei Zaharia
Released May 26, 2014
- Repository
- apache/spark
- Website
- spark.apache.org
- Wikipedia
- Wikipedia


MLflow seems to have a length limit of 5000 when setting tags (see below).