Posted to issues@nifi.apache.org by "ASF subversion and git services (Jira)" <ji...@apache.org> on 2023/11/01 21:25:00 UTC

[jira] [Commented] (NIFI-12240) Add Python processors that are capable of interacting with vector stores

    [ https://issues.apache.org/jira/browse/NIFI-12240?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17781897#comment-17781897 ] 

ASF subversion and git services commented on NIFI-12240:
--------------------------------------------------------

Commit 5bcad9eef33d665c5b3a4e13d17bf625200d53df in nifi's branch refs/heads/main from Mark Payne
[ https://gitbox.apache.org/repos/asf?p=nifi.git;h=5bcad9eef3 ]

NIFI-12240 Added Python Processors for Docs, ChatGPT, Chroma, and Pinecone

Created new Python processors for text embeddings, inserting into Chroma, querying Chroma, querying ChatGPT, and inserting into and querying Pinecone. Fixed some bugs in the Python framework. Added Python extensions to the assembly. Also added the ability to load dependencies from a requirements.txt file, as that was important for making the different vector store implementations work well together.

Excluded nifi-python-extensions-bundle from the GitHub build because it requires Maven to use the unpack-resources goal, which will not work in GitHub because the build there uses mvn compile instead of mvn install.

- ParseDocument
- ChunkDocument
- PromptChatGPT
- PutChroma
- PutPinecone
- QueryChroma
- QueryPinecone

NIFI-12195 Added support for requirements.txt to define Python dependencies

This closes #7894

Signed-off-by: David Handermann <ex...@apache.org>


> Add Python processors that are capable of interacting with vector stores
> ------------------------------------------------------------------------
>
>                 Key: NIFI-12240
>                 URL: https://issues.apache.org/jira/browse/NIFI-12240
>             Project: Apache NiFi
>          Issue Type: New Feature
>          Components: Extensions
>            Reporter: Mark Payne
>            Assignee: Mark Payne
>            Priority: Major
>             Fix For: 2.0.0
>
>          Time Spent: 2h 50m
>  Remaining Estimate: 0h
>
> There are many different vector stores these days. We should build processors that are capable of ingesting unstructured text, chunking it, and inserting it into at least Pinecone and Chroma. We should also have the ability to query the vector stores.
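
To illustrate the chunking step mentioned in the description, here is a minimal sketch in plain Python. It is not the ChunkDocument processor's implementation. The function name chunk_text and the chunk_size and overlap parameters are hypothetical, chosen only to show splitting text into fixed-size, overlapping pieces before they are embedded and stored.

```python
def chunk_text(text, chunk_size=200, overlap=20):
    """Split text into character-based chunks that overlap by `overlap` characters.

    Hypothetical helper for illustration only; not part of NiFi.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be >= 0 and smaller than chunk_size")

    step = chunk_size - overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once the current window has reached the end of the text,
        # so no trailing chunk made up only of overlap is emitted.
        if start + chunk_size >= len(text):
            break
    return chunks


if __name__ == "__main__":
    for piece in chunk_text("abcdefghij", chunk_size=4, overlap=1):
        print(piece)
```

The overlap means each chunk repeats the end of the previous one, which is a common way to keep context that would otherwise be cut at a chunk boundary. A real pipeline would more likely split on sentence or paragraph boundaries than on raw character counts.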



--
This message was sent by Atlassian Jira
(v8.20.10#820010)