dbfp_pub/docs/README_DEV_INDEX

=====================
Fingerprint Index
=====================
-------------
INTRO
-------------
Why:
The database fingerprint index exists to speed up lookups and to provide quick
statistics. I have created about 180 fingerprints so far. In the future we will likely
have more than 1000 fingerprints, and the index is designed with that growth in mind.
Where:
An SQLite database (_index_dbfp.db) is populated with the index data. The current design
expects the index file to be located in the same directory as all the fingerprints.
The fingerprint file names are generated to be unique and should never collide.
How:
To create the index, each fingerprint is read and its unique hash values are inserted
into the index database along with the fingerprint file name. Each fingerprint has
an md5 hash that represents the entire database, along with an md5 hash that represents
each table in the database. These md5 hashes are used as unique keys that can be
queried in the fingerprint index.
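As a rough illustration of the step described above, the following Python sketch
records one table hash and the fingerprint file that contains it. It is not the
project's actual code. The function name add_to_table_index, the sample hash, and the
".json" fingerprint file names are assumptions made for the example; only the table
and column names come from the DB SCHEMA section.

```python
import sqlite3

def add_to_table_index(conn, table_md5, fp_name):
    """Record that fingerprint fp_name contains a table whose hash is table_md5."""
    row = conn.execute(
        "SELECT fp_list FROM md5_table WHERE md5_table = ?", (table_md5,)
    ).fetchone()
    if row is None:
        # First fingerprint seen with this table hash.
        conn.execute(
            "INSERT INTO md5_table (md5_table, fp_list, fp_count) VALUES (?, ?, 1)",
            (table_md5, fp_name),
        )
        return
    names = row[0].split(",")
    if fp_name not in names:
        # Append to the CSV list and keep the count in sync.
        names.append(fp_name)
        conn.execute(
            "UPDATE md5_table SET fp_list = ?, fp_count = ? WHERE md5_table = ?",
            (",".join(names), len(names), table_md5),
        )

# In-memory database for the example; the real index would be _index_dbfp.db.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE md5_table ("
    "md5_table TEXT PRIMARY KEY, fp_list TEXT, fp_count INTEGER)"
)

# Hypothetical hash value and fingerprint file names, for illustration only.
sample_md5 = "0123456789abcdef0123456789abcdef"
add_to_table_index(conn, sample_md5, "app_a.json")
add_to_table_index(conn, sample_md5, "app_b.json")
conn.commit()
```

Calling the function twice with the same file name leaves the row unchanged, so
re-running the indexer over the same directory should not inflate fp_count.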
----------
DESIGN
----------
Each CREATE statement can be textually unique because of the various styles and syntax
allowed, even though the result of the CREATE statements is the same...
The CREATE statements are md5 hashed, and those md5 hashes are then hashed to produce db_md5.
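The two-level hashing described above might look like the sketch below. The helper
names md5_hex and schema_hashes are hypothetical, and the way the table hashes are
combined (sorted, then concatenated) is an assumption; this document does not say how
the real implementation orders or joins them.

```python
import hashlib

def md5_hex(text):
    """Return the hex md5 digest of a string, encoded as UTF-8."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()

def schema_hashes(create_statements):
    """Hash each CREATE statement, then hash the table hashes together.

    Assumption: the table hashes are sorted and concatenated before the
    second hashing step, so the result does not depend on table order.
    """
    table_md5s = sorted(md5_hex(stmt) for stmt in create_statements)
    db_md5 = md5_hex("".join(table_md5s))
    return db_md5, table_md5s

# Made-up CREATE statements, for illustration only.
statements = [
    "CREATE TABLE contacts (id INTEGER PRIMARY KEY, name TEXT)",
    "CREATE TABLE messages (id INTEGER PRIMARY KEY, body TEXT)",
]
db_md5, table_md5s = schema_hashes(statements)
```

Sorting before the second hash is one way to make two databases with the same tables
produce the same db_md5 regardless of the order in which the tables were created.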
-------------
DB SCHEMA
-------------
[ Table: md5_all ]
md5_db TEXT PRIMARY KEY, (hash value of the database schema)
md5_list TEXT, (CSV list of md5 hash of tables within the database)
fp_list TEXT, (CSV list of fingerprint file names)
fp_count INTEGER); (count of the fingerprints)
[ Table: md5_table ]
md5_table TEXT PRIMARY KEY, (hash value of the table schema)
fp_list TEXT, (CSV list of fingerprint file names)
fp_count INTEGER); (count of the fingerprints)
*
* In the metadata table below, md5_db is not a primary key and is not unique, because
* a row is stored for each app info record
*
[ Table: metadata ]
md5_db TEXT, (hash value of the database schema)
app_name TEXT, (name of the app)
app_ver TEXT, (version of the app)
db_file TEXT, (file name of the database scanned)
fp_file TEXT, (file name of the fingerprint)
scan_date TEXT); (date the db was scanned)
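
The schema above can be exercised with Python's sqlite3 module, as in the following
sketch. The CREATE TABLE wording, the function apps_for_schema, and all sample rows
(app name, versions, file names, dates, hash) are assumptions made for illustration;
only the table and column names are taken from the listing above.

```python
import sqlite3

SCHEMA = """
CREATE TABLE IF NOT EXISTS md5_all (
    md5_db   TEXT PRIMARY KEY,
    md5_list TEXT,
    fp_list  TEXT,
    fp_count INTEGER);
CREATE TABLE IF NOT EXISTS md5_table (
    md5_table TEXT PRIMARY KEY,
    fp_list   TEXT,
    fp_count  INTEGER);
CREATE TABLE IF NOT EXISTS metadata (
    md5_db    TEXT,
    app_name  TEXT,
    app_ver   TEXT,
    db_file   TEXT,
    fp_file   TEXT,
    scan_date TEXT);
"""

def apps_for_schema(conn, md5_db):
    """Return (app_name, app_ver, fp_file) rows that share one schema hash."""
    return conn.execute(
        "SELECT app_name, app_ver, fp_file FROM metadata "
        "WHERE md5_db = ? ORDER BY app_name, app_ver",
        (md5_db,),
    ).fetchall()

# In-memory database for the example; the real index would be _index_dbfp.db.
conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)

# Hypothetical rows: two app versions whose databases share one schema hash.
sample_md5 = "0123456789abcdef0123456789abcdef"
conn.executemany(
    "INSERT INTO metadata VALUES (?, ?, ?, ?, ?, ?)",
    [
        (sample_md5, "ExampleApp", "1.0", "example.db", "example_1_0.json", "2015-01-01"),
        (sample_md5, "ExampleApp", "1.1", "example.db", "example_1_1.json", "2015-02-01"),
    ],
)
conn.commit()

matches = apps_for_schema(conn, sample_md5)
```

Both inserts succeed with the same md5_db value, which is why that column cannot be a
primary key in the metadata table.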