Posted to commits@arrow.apache.org by ko...@apache.org on 2023/04/05 07:14:40 UTC
[arrow-site] branch main updated: Add Hugging Face Datasets to powered_by.md (#341)
This is an automated email from the ASF dual-hosted git repository.
kou pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/arrow-site.git
The following commit(s) were added to refs/heads/main by this push:
new 4768c3c0c07 Add Hugging Face Datasets to powered_by.md (#341)
4768c3c0c07 is described below
commit 4768c3c0c07103155759f29652b73a3b290dfa3d
Author: Christopher Akiki <ch...@gmail.com>
AuthorDate: Wed Apr 5 09:14:34 2023 +0200
Add Hugging Face Datasets to powered_by.md (#341)
This adds HF `datasets` to the list of projects using Arrow.
(https://huggingface.co/docs/datasets/about_arrow)
---
powered_by.md | 7 ++++++-
1 file changed, 6 insertions(+), 1 deletion(-)
diff --git a/powered_by.md b/powered_by.md
index ec368907654..fad37c2a676 100644
--- a/powered_by.md
+++ b/powered_by.md
@@ -123,7 +123,11 @@ short description of your use case.
* **[HASH][39]:** HASH is an open-core platform for building, running, and learning
from simulations, with an in-browser IDE. HASH Engine uses Apache Arrow to power
the datastore for simulation state during computation, enabling zero-copy data
- transfer between simulation logic written across Rust, JavaScript, and Python.
+* **[Hugging Face Datasets][47]:** A machine learning datasets library and hub
+ for accessing, processing and sharing datasets for audio, computer vision,
+ natural language processing, and tabular tasks. Dataset objects are wrappers around
+ Arrow Tables and memory-mapped from disk to support out-of-core parallel processing
+ for machine learning workflows.
* **[InAccel][29]:** A machine learning acceleration framework which leverages
FPGAs-as-a-service. InAccel supports dataframes backed by Apache Arrow to
serve as input for our implemented ML algorithms. Those dataframes can be
@@ -248,3 +252,4 @@ short description of your use case.
[44]: https://clickhouse.com/docs/en/interfaces/formats/#data-format-arrow
[45]: https://unum.cloud/ukv/
[46]: https://github.com/GrepTimeTeam/greptimedb/
+[47]: https://github.com/huggingface/datasets
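
Note on the "memory-mapped from disk" wording in the new entry: the sketch below illustrates only the general idea of reading a slice of an on-disk column through a memory map, without loading the whole file. It uses Python's standard library (`mmap`, `struct`) and is not the Arrow or Hugging Face `datasets` API. The file layout (a bare little-endian int64 buffer) and the helper names `write_int64_column`, `read_slice`, and `demo` are assumptions made up for illustration.

```python
import mmap
import os
import struct
import tempfile


def write_int64_column(path, values):
    """Write values as one contiguous little-endian int64 buffer.

    Hypothetical layout for illustration; real Arrow files add schema
    metadata, record batches, and a footer.
    """
    with open(path, "wb") as f:
        f.write(struct.pack(f"<{len(values)}q", *values))


def read_slice(path, start, stop):
    """Memory-map the file and decode only rows [start, stop)."""
    width = struct.calcsize("<q")  # 8 bytes per int64
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
            # Only the pages backing this byte range need to be read.
            chunk = mm[start * width : stop * width]
    return list(struct.unpack(f"<{stop - start}q", chunk))


def demo():
    """Write 1000 rows to a temporary file, then read back rows 10..14."""
    fd, path = tempfile.mkstemp(suffix=".bin")
    os.close(fd)
    try:
        write_int64_column(path, list(range(1000)))
        return read_slice(path, 10, 15)
    finally:
        os.remove(path)


if __name__ == "__main__":
    print(demo())
```

Because the slice is addressed by byte offset, the cost of a read depends on the size of the slice rather than the size of the file, which is the property the entry's "out-of-core" claim relies on.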