Published June 3, 2007 | https://doi.org/10.59350/t125s-b8827

Finding email with Strigi in .tar backups

Now that my CUBIC desktop machine is being shut down, I made the necessary backups, among them a mail.tar holding about a year of mail correspondence: almost 8700 files, about 500MB in total. Strigi is a perfect tool to help me find messages in this archive, as it will recurse into the .tar archive, and even into email attachments. I created an index just for the archive with:

strigicmd create -t clucene -d index/ mail.tar

It took Strigi about 30 seconds to index the whole archive. That's good performance!
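
To make the recursion concrete, here is a minimal Python sketch of the idea: open the .tar, treat each file in it as an email message, and then look one level deeper for attachments. It uses only the Python standard library and is not how Strigi is implemented; the function name walk_mail_tar is made up for this illustration, and it assumes every regular file in the archive is a single RFC 822 message.

```python
import email
import tarfile
from email import policy


def walk_mail_tar(tar_path):
    """Yield (member name, subject, attachment filenames) per message.

    Hypothetical helper: assumes each regular file in the archive is one
    email message. Files that are not messages simply yield a None subject.
    """
    with tarfile.open(tar_path) as tar:
        for member in tar:
            if not member.isfile():
                continue  # skip directories, links, and other non-file entries
            raw = tar.extractfile(member).read()
            msg = email.message_from_bytes(raw, policy=policy.default)
            # One level deeper: collect the file names of any attachments.
            attachments = [part.get_filename() for part in msg.iter_attachments()]
            yield member.name, msg["Subject"], attachments
```

Calling list(walk_mail_tar("mail.tar")) would give one tuple per file in the archive. A real indexer would go on to extract text from the attachments as well, which this sketch does not attempt.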

Now, Strigi indexes the full text of the content, but also uses controlled vocabularies (among them one specifically for chemistry). So I can search for email messages which have "article" in the subject with:

strigicmd query -t clucene -d index/ email.subject:article

However, the content of the From: and To: headers was not yet extracted. That was easily patched, and now allows me to find correspondence between me and, for example, Christoph:

strigicmd query -t clucene -d index/ email.to:Christoph AND email.from:Egon
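
Without an index, the same question can be answered by a plain scan over the archive. The Python sketch below is a stand-in under stated assumptions, not Strigi's query engine: find_messages is a hypothetical helper, the header names are passed as a dictionary, and a case-insensitive substring match stands in for whatever matching Strigi applies to email.to and email.from.

```python
import email
import tarfile
from email import policy


def find_messages(tar_path, criteria):
    """Return names of tar members whose headers contain every given term.

    criteria maps a header name to a search term, e.g. {"To": "Christoph"}.
    All entries must match, mirroring the AND in the query above. The
    case-insensitive substring test is an assumption for illustration.
    """
    hits = []
    with tarfile.open(tar_path) as tar:
        for member in tar:
            if not member.isfile():
                continue  # only regular files can hold a message
            raw = tar.extractfile(member).read()
            msg = email.message_from_bytes(raw, policy=policy.default)
            if all(
                term.lower() in str(msg.get(header, "")).lower()
                for header, term in criteria.items()
            ):
                hits.append(member.name)
    return hits
```

For example, find_messages("mail.tar", {"To": "Christoph", "From": "Egon"}) would list the archive members sent from Egon to Christoph. A scan like this re-reads the whole archive on every call, which is the work an index does only once.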

Additional details

Identifiers

UUID
01dce11c-51cc-4ede-a385-a784a2bc5b19
GUID
https://doi.org/10.59350/t125s-b8827
URL
https://chem-bla-ics.linkedchemistry.info/2007/06/03/finding-email-with-strigi-in-tar.html

Dates

Issued
2007-06-03T00:00:00
Updated
2025-02-15T00:00:00