Posted to issues@arrow.apache.org by "Tanya Schlusser (JIRA)" <ji...@apache.org> on 2019/02/07 22:46:00 UTC

[jira] [Commented] (ARROW-4313) Define general benchmark database schema

    [ https://issues.apache.org/jira/browse/ARROW-4313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16763145#comment-16763145 ] 

Tanya Schlusser commented on ARROW-4313:
----------------------------------------

Thank you Antoine! I missed this last comment. "actual frequency" is a good name, and I used it.
 * I did not understand the conversations about little- and big-endian, so I did not add any endianness-related fields to the database.
 * I was surprised during testing by the behavior of nulls in the database, so some things don't yet work the way I'd like (the example script fails in one place).

Thank you everyone for so much feedback. I have uploaded new files for the current data model and am happy to change things according to feedback. If you don't like something, it can be fixed :) 

> Define general benchmark database schema
> ----------------------------------------
>
>                 Key: ARROW-4313
>                 URL: https://issues.apache.org/jira/browse/ARROW-4313
>             Project: Apache Arrow
>          Issue Type: New Feature
>          Components: Benchmarking
>            Reporter: Wes McKinney
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 0.13.0
>
>         Attachments: benchmark-data-model.erdplus, benchmark-data-model.png
>
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> Some possible attributes that the benchmark database should track, to permit heterogeneity of hardware and programming languages:
> * Timestamp of benchmark run
> * Git commit hash of codebase
> * Machine unique name (sort of the "user id")
> * CPU identification for machine, and clock frequency (in case of overclocking)
> * CPU cache sizes (L1/L2/L3)
> * Whether or not CPU throttling is enabled (if it can be easily determined)
> * RAM size
> * GPU identification (if any)
> * Benchmark unique name
> * Programming language(s) associated with benchmark (e.g. a benchmark may involve both C++ and Python)
> * Benchmark time, plus mean and standard deviation if available, else NULL
> See the discussion on the mailing list: https://lists.apache.org/thread.html/278e573445c83bbd8ee66474b9356c5291a16f6b6eca11dbbe4b473a@%3Cdev.arrow.apache.org%3E
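
For illustration only, the attributes listed in the issue description could map onto tables roughly as follows. This is a minimal sketch using Python's built-in sqlite3 module, not the data model in the attached files: every table name, column name, and value below is a hypothetical placeholder, and SQLite is chosen only because it needs no setup.

```python
import sqlite3

# Hypothetical schema: all table and column names are assumptions made for
# this sketch; the actual model lives in the attached benchmark-data-model files.
SCHEMA = """
CREATE TABLE machine (
    machine_id       INTEGER PRIMARY KEY,
    machine_name     TEXT NOT NULL UNIQUE,  -- sort of the "user id"
    cpu_model        TEXT NOT NULL,
    cpu_frequency_hz INTEGER,               -- measured clock, in case of overclocking
    l1_cache_bytes   INTEGER,
    l2_cache_bytes   INTEGER,
    l3_cache_bytes   INTEGER,
    cpu_throttling   INTEGER,               -- 0/1, NULL when it cannot be determined
    ram_bytes        INTEGER,
    gpu_model        TEXT                   -- NULL when there is no GPU
);
CREATE TABLE benchmark (
    benchmark_id     INTEGER PRIMARY KEY,
    benchmark_name   TEXT NOT NULL UNIQUE,
    languages        TEXT NOT NULL          -- e.g. 'C++,Python'
);
CREATE TABLE benchmark_run (
    run_id           INTEGER PRIMARY KEY,
    machine_id       INTEGER NOT NULL REFERENCES machine(machine_id),
    benchmark_id     INTEGER NOT NULL REFERENCES benchmark(benchmark_id),
    run_timestamp    TEXT NOT NULL,         -- ISO 8601 string
    git_commit_hash  TEXT NOT NULL,
    elapsed_seconds  REAL NOT NULL,
    mean_seconds     REAL,                  -- NULL if not available
    stddev_seconds   REAL                   -- NULL if not available
);
"""


def create_db(path=":memory:"):
    """Create a database at `path` containing the sketch schema."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn


conn = create_db()

# Placeholder rows, invented purely to exercise the schema.
conn.execute(
    "INSERT INTO machine (machine_name, cpu_model, ram_bytes) VALUES (?, ?, ?)",
    ("example-machine", "example-cpu", 16 * 2**30),
)
conn.execute(
    "INSERT INTO benchmark (benchmark_name, languages) VALUES (?, ?)",
    ("example-benchmark", "C++,Python"),
)
conn.execute(
    "INSERT INTO benchmark_run (machine_id, benchmark_id, run_timestamp,"
    " git_commit_hash, elapsed_seconds) VALUES (?, ?, ?, ?, ?)",
    (1, 1, "2019-02-07T22:46:00Z", "0000000", 1.5),
)

# Columns omitted from the INSERT (mean, standard deviation) come back as NULL.
row = conn.execute(
    "SELECT elapsed_seconds, mean_seconds, stddev_seconds FROM benchmark_run"
).fetchone()
print(row)
```

Splitting machine, benchmark, and run into separate tables is one possible design choice: hardware details are stored once per machine rather than repeated on every run, and the nullable columns cover attributes that may not be available.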



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)