Posted to dev@nlpcraft.apache.org by "Gleb (Jira)" <ji...@apache.org> on 2020/06/07 22:32:00 UTC

[jira] [Comment Edited] (NLPCRAFT-67) Python machine learning module

    [ https://issues.apache.org/jira/browse/NLPCRAFT-67?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17127793#comment-17127793 ] 

Gleb edited comment on NLPCRAFT-67 at 6/7/20, 10:31 PM:
--------------------------------------------------------

Batching is finished. Current benchmarks are (done on my PC: AMD Ryzen 9 3900X 12-Core Processor; GeForce RTX 2070 SUPER):

N    | CUDA (ms) | CPU (ms)
1    | 58        | 141
5    | 79        | 479
100  | 500       | 3300
1000 | 5000      | 33000

Here N is the number of sentences in a batch, and the CUDA and CPU columns give the processing time in milliseconds. I believe that further optimization is possible for big batches on CUDA, where the BERT forward pass accounts for only a small percentage of the total time. Big batches, on the other hand, do not seem to be processable quickly on CPU: out of the 33 seconds needed for the N = 1000 batch on CPU, 30 seconds were spent in the BERT forward pass.
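
To make the scaling easier to read, here is a small sketch that derives the per-sentence time and the CPU/CUDA ratio from the numbers reported above. The names `timings_ms` and `summarize` are illustrative only and are not part of the NLPCraft Python module; the only inputs are the figures from this comment.

```python
# Timings copied from the benchmark in this comment:
# batch size N -> (CUDA time in ms, CPU time in ms).
timings_ms = {
    1: (58, 141),
    5: (79, 479),
    100: (500, 3300),
    1000: (5000, 33000),
}


def summarize(timings):
    """Return per-sentence times and the CPU/CUDA ratio for each batch size."""
    rows = []
    for n, (cuda_ms, cpu_ms) in sorted(timings.items()):
        rows.append({
            "n": n,
            "cuda_ms_per_sentence": cuda_ms / n,
            "cpu_ms_per_sentence": cpu_ms / n,
            "cpu_over_cuda": cpu_ms / cuda_ms,
        })
    return rows


rows = summarize(timings_ms)
for row in rows:
    print(
        f"N={row['n']:>4}  "
        f"CUDA {row['cuda_ms_per_sentence']:6.2f} ms/sentence  "
        f"CPU {row['cpu_ms_per_sentence']:6.2f} ms/sentence  "
        f"CPU/CUDA {row['cpu_over_cuda']:.2f}x"
    )

# Share of the N = 1000 CPU run spent in the BERT forward pass
# (30 s out of 33 s, as stated in the comment).
bert_share_cpu = 30 / 33
print(f"BERT forward share on CPU at N=1000: {bert_share_cpu:.0%}")
```

On these numbers the per-sentence cost flattens at about 5 ms on CUDA and 33 ms on CPU from N = 100 onward, so batches larger than 100 bring no further per-sentence gain in this benchmark.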


was (Author: ifropc):
Batching is finished. Current benchmarks are (done on my PC):

N    | CUDA (ms) | CPU (ms)
1    | 58        | 141
5    | 79        | 479
100  | 500       | 3300
1000 | 5000      | 33000

Here N is the number of sentences in a batch, and the CUDA and CPU columns give the processing time in milliseconds. I believe that further optimization is possible for big batches on CUDA, where the BERT forward pass accounts for only a small percentage of the total time. Big batches, on the other hand, do not seem to be processable quickly on CPU: out of the 33 seconds needed for the N = 1000 batch on CPU, 30 seconds were spent in the BERT forward pass.

> Python machine learning module 
> -------------------------------
>
>                 Key: NLPCRAFT-67
>                 URL: https://issues.apache.org/jira/browse/NLPCRAFT-67
>             Project: NLPCraft
>          Issue Type: New Feature
>            Reporter: Gleb
>            Assignee: Gleb
>            Priority: Major
>             Fix For: 0.7.0
>
>
> Python part which consists of Bert masked word prediction and FastText synonyms filter



--
This message was sent by Atlassian Jira
(v8.3.4#803005)