Posted to dev@tika.apache.org by "Gregory Lepore (Jira)" <ji...@apache.org> on 2023/06/15 12:04:00 UTC

[jira] [Updated] (TIKA-4083) Add magic for ClamAV CDiff files

     [ https://issues.apache.org/jira/browse/TIKA-4083?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Gregory Lepore updated TIKA-4083:
---------------------------------
    Description: 
The ClamAV CDIFF format appears 1,582 times in the latest Common Crawl dataset. No known mime type.

The magic is 436C616D41562D44696666 at offset 0 (ClamAV-Diff in ASCII).

File extension is .cdiff.

 

[https://blog.clamav.net/2021/03/clamav-cvds-cdiffs-and-magic-behind.html]

  was:
The ClamAV CDIFF format appears 1,582 times in the latest Common Crawl dataset. No known mime type.

The magic is 436C616D41562D44696666 at offset 0 (ClamAV-Diff in ASCII).

 

https://blog.clamav.net/2021/03/clamav-cvds-cdiffs-and-magic-behind.html


> Add magic for ClamAV CDiff files
> --------------------------------
>
>                 Key: TIKA-4083
>                 URL: https://issues.apache.org/jira/browse/TIKA-4083
>             Project: Tika
>          Issue Type: Sub-task
>            Reporter: Gregory Lepore
>            Priority: Minor
>         Attachments: 0a0f28d9d03c84aaa97a996719c97663c1de40b7d0d710140fd47676a89cfcaa, 0a55b4b748ff9d0f542f2a8fb4ee9462d7ff299063a5ff9c653b74882a510d35, 0a8b3069b57c0069d99149c5296fc001d9fa8c8f88e0f865940dccc99c1af5c1
>
>
> The ClamAV CDIFF format appears 1,582 times in the latest Common Crawl dataset. No known mime type.
> The magic is 436C616D41562D44696666 at offset 0 (ClamAV-Diff in ASCII).
> File extension is .cdiff.
>  
> [https://blog.clamav.net/2021/03/clamav-cvds-cdiffs-and-magic-behind.html]
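
For illustration, the check described above (an 11-byte signature at offset 0) can be sketched in a few lines of Python. This is only a rough sketch of the byte comparison, not how Tika itself would implement the detection; the names CDIFF_MAGIC, looks_like_cdiff and file_looks_like_cdiff are hypothetical and do not come from the issue.

```python
# Magic quoted in the issue: hex 436C616D41562D44696666 at offset 0,
# which decodes to the ASCII string "ClamAV-Diff".
# All names below are hypothetical, chosen only for this sketch.
CDIFF_MAGIC = bytes.fromhex("436C616D41562D44696666")


def looks_like_cdiff(data: bytes) -> bool:
    """Return True if the buffer starts with the ClamAV-Diff magic."""
    return data.startswith(CDIFF_MAGIC)


def file_looks_like_cdiff(path: str) -> bool:
    """Read only as many leading bytes as the magic needs, then compare."""
    with open(path, "rb") as handle:
        return looks_like_cdiff(handle.read(len(CDIFF_MAGIC)))
```

Calling file_looks_like_cdiff on one of the attached samples would be the obvious way to try it out. The check only looks at the leading bytes, so it says nothing about whether the rest of the file is a well-formed CDIFF.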



--
This message was sent by Atlassian Jira
(v8.20.10#820010)