Posted to commits@arrow.apache.org by al...@apache.org on 2021/08/01 10:36:36 UTC

[arrow-datafusion] branch master updated: expand file glob within prettier (#803)

This is an automated email from the ASF dual-hosted git repository.

alamb pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/arrow-datafusion.git


The following commit(s) were added to refs/heads/master by this push:
     new 3eac2e6  expand file glob within prettier (#803)
3eac2e6 is described below

commit 3eac2e65437de52a26d2380a7d49fbcea9eb2c15
Author: QP Hou <qp...@scribd.com>
AuthorDate: Sun Aug 1 03:36:32 2021 -0700

    expand file glob within prettier (#803)
    
    The '**' pattern is not supported by some shells, including the one we
    use in CI.
---
 .github/workflows/dev.yml                         |  4 +--
 ballista/README.md                                | 20 +++++++--------
 docs/user-guide/src/distributed/docker-compose.md |  2 +-
 docs/user-guide/src/distributed/kubernetes.md     | 30 +++++++++++------------
 4 files changed, 28 insertions(+), 28 deletions(-)

diff --git a/.github/workflows/dev.yml b/.github/workflows/dev.yml
index 8bb35f1..39c449c 100644
--- a/.github/workflows/dev.yml
+++ b/.github/workflows/dev.yml
@@ -64,7 +64,7 @@ jobs:
           # if you encounter error, try rerun the command below with --write instead of --check
           # and commit the changes
           npx prettier@2.3.2 --check \
-            {ballista,datafusion,datafusion-examples,docs,python}/**/*.md \
+            '{ballista,datafusion,datafusion-examples,docs,python}/**/*.md' \
             README.md \
             DEVELOPERS.md \
-            ballista/**/*.{ts,tsx}
+            'ballista/**/*.{ts,tsx}'
diff --git a/ballista/README.md b/ballista/README.md
index 0a8db63..eeb4273 100644
--- a/ballista/README.md
+++ b/ballista/README.md
@@ -19,8 +19,8 @@
 
 # Ballista: Distributed Compute with Apache Arrow and DataFusion
 
-Ballista is a distributed compute platform primarily implemented in Rust, and powered by Apache Arrow and 
-DataFusion. It is built on an architecture that allows other programming languages (such as Python, C++, and 
+Ballista is a distributed compute platform primarily implemented in Rust, and powered by Apache Arrow and
+DataFusion. It is built on an architecture that allows other programming languages (such as Python, C++, and
 Java) to be supported as first-class citizens without paying a penalty for serialization costs.
 
 The foundational technologies in Ballista are:
@@ -37,23 +37,23 @@ redundancy in the case of a scheduler failing.
 
 # Getting Started
 
-Fully working examples are available. Refer to the [Ballista Examples README](../ballista-examples/README.md) for 
+Fully working examples are available. Refer to the [Ballista Examples README](../ballista-examples/README.md) for
 more information.
 
 ## Distributed Scheduler Overview
 
-Ballista uses the DataFusion query execution framework to create a physical plan and then transforms it into a 
+Ballista uses the DataFusion query execution framework to create a physical plan and then transforms it into a
 distributed physical plan by breaking the query down into stages whenever the partitioning scheme changes.
 
-Specifically, any `RepartitionExec` operator is replaced with an `UnresolvedShuffleExec` and the child operator 
+Specifically, any `RepartitionExec` operator is replaced with an `UnresolvedShuffleExec` and the child operator
 of the repartition operator is wrapped in a `ShuffleWriterExec` operator and scheduled for execution.
 
-Each executor polls the scheduler for the next task to run. Tasks are currently always `ShuffleWriterExec` operators 
-and each task represents one *input* partition that will be executed. The resulting batches are repartitioned 
-according to the shuffle partitioning scheme and each *output* partition is streamed to disk in Arrow IPC format.
+Each executor polls the scheduler for the next task to run. Tasks are currently always `ShuffleWriterExec` operators
+and each task represents one _input_ partition that will be executed. The resulting batches are repartitioned
+according to the shuffle partitioning scheme and each _output_ partition is streamed to disk in Arrow IPC format.
 
-The scheduler will replace `UnresolvedShuffleExec` operators with `ShuffleReaderExec` operators once all shuffle 
-tasks have completed. The `ShuffleReaderExec` operator connects to other executors as required using the Flight 
+The scheduler will replace `UnresolvedShuffleExec` operators with `ShuffleReaderExec` operators once all shuffle
+tasks have completed. The `ShuffleReaderExec` operator connects to other executors as required using the Flight
 interface, and streams the shuffle IPC files.
 
 # How does this compare to Apache Spark?
diff --git a/docs/user-guide/src/distributed/docker-compose.md b/docs/user-guide/src/distributed/docker-compose.md
index 14989e5..9ada1ba 100644
--- a/docs/user-guide/src/distributed/docker-compose.md
+++ b/docs/user-guide/src/distributed/docker-compose.md
@@ -24,7 +24,7 @@ demonstrates how to start a cluster using a single process that acts as both a s
 volume mounted into the container so that Ballista can access the host file system.
 
 ```yaml
-version: '2.2'
+version: "2.2"
 services:
   etcd:
     image: quay.io/coreos/etcd:v3.4.9
diff --git a/docs/user-guide/src/distributed/kubernetes.md b/docs/user-guide/src/distributed/kubernetes.md
index 4b80d17..ef4acca 100644
--- a/docs/user-guide/src/distributed/kubernetes.md
+++ b/docs/user-guide/src/distributed/kubernetes.md
@@ -129,16 +129,16 @@ spec:
         ballista-cluster: ballista
     spec:
       containers:
-      - name: ballista-scheduler
-        image: <your-image>
-        command: ["/scheduler"]
-        args: ["--bind-port=50050"]
-        ports:
-          - containerPort: 50050
-            name: flight
-        volumeMounts:
-          - mountPath: /mnt
-            name: data
+        - name: ballista-scheduler
+          image: <your-image>
+          command: ["/scheduler"]
+          args: ["--bind-port=50050"]
+          ports:
+            - containerPort: 50050
+              name: flight
+          volumeMounts:
+            - mountPath: /mnt
+              name: data
       volumes:
         - name: data
           persistentVolumeClaim:
@@ -245,10 +245,10 @@ spec:
   minReplicaCount: 0
   maxReplicaCount: 5
   triggers:
-  - type: external
-    metadata:
-      # Change this DNS if the scheduler isn't deployed in the "default" namespace
-      scalerAddress: ballista-scheduler.default.svc.cluster.local:50050
+    - type: external
+      metadata:
+        # Change this DNS if the scheduler isn't deployed in the "default" namespace
+        scalerAddress: ballista-scheduler.default.svc.cluster.local:50050
 ```
 
 And then deploy it into the cluster:
@@ -261,4 +261,4 @@ If the cluster is inactive, Keda will now scale the number of executors down to
 you launch a query. Please note that Keda will perform a scan once every 30 seconds, so it might take a bit to
 scale the executors.
 
-Please visit Keda's [documentation page](https://keda.sh/docs/2.3/concepts/scaling-deployments/) for more information.
\ No newline at end of file
+Please visit Keda's [documentation page](https://keda.sh/docs/2.3/concepts/scaling-deployments/) for more information.
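
Note on the quoting change in dev.yml: quoting each pattern stops the shell from expanding it, so prettier receives the literal glob and resolves `**` itself. The sketch below is only an illustration of that difference and is not part of the commit. It uses Python's standard `glob` module as a stand-in: `recursive=False` treats `**` like a single `*` (similar to a shell without recursive globbing), while `recursive=True` matches any directory depth (similar to a tool that expands the pattern itself). The file names are hypothetical.

```python
import glob
import os
import tempfile

# Hypothetical layout: Markdown files at three different depths under docs/.
FILES = (
    "docs/top.md",
    "docs/user-guide/one.md",
    "docs/user-guide/src/deep.md",
)


def make_tree(root):
    """Create the empty sample files under root."""
    for rel in FILES:
        path = os.path.join(root, rel)
        os.makedirs(os.path.dirname(path), exist_ok=True)
        open(path, "w").close()


def match(root, pattern, recursive):
    """Return the sorted matches for pattern, as paths relative to root."""
    full = os.path.join(glob.escape(root), pattern)
    found = glob.glob(full, recursive=recursive)
    return sorted(os.path.relpath(p, root).replace(os.sep, "/") for p in found)


with tempfile.TemporaryDirectory() as root:
    make_tree(root)
    # '**' treated as a plain '*': exactly one directory level is matched.
    shallow = match(root, "docs/**/*.md", recursive=False)
    # '**' treated as "zero or more directories": every depth is matched.
    deep = match(root, "docs/**/*.md", recursive=True)

print(shallow)
print(deep)
```

In this sketch the non-recursive match finds only the file exactly one directory below `docs/`, so a check driven by such an expansion would silently skip the other files. That is the kind of gap the quoted patterns avoid.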