# Btrieve File Format MajorBBS/Worldgroup make use of Actian/Pervasive/Novell Btrieve as a native, persistent database platform. In this article we'll be discussing the Btrieve File Format as reverse engineered in the course of developing MBBSEmu. There is also an excellent book titled [Btrieve Complete, by Jim Kyle](https://books.google.com/books/about/Btrieve_Complete.html?id=_Vu8BaXMr4UC) which describes the file format is painstaking detail. # Assumptions * MajorBBS/Worldgroup used a version of Btrieve <= 6.0 * MajorBBS/Worldgroup did not make use of Compressed Fields # Basic File Structure Btrieve uses a Page-based file structure where data is segmented into pre-defined blocks of data known as "Pages". The total size of a Btrieve file can be defined as `Number of Pages * Page Size` The first Page in a Btrieve file is known as the `File Control Record` (FCR). The FCR contains meta-data related to the Btrieve file to give the engine information on the contents, its location and what to expect. The FCR always starts at byte `0` # FCR Information _(offsets relative to 0x0)_ ### File Meta-Data Name | Offset | Data Type | Definition -----|--------|-----------|-----------| Page Length | `0x08` | `word` | Defined length of each `PAGE` (min. 512/max. 4096, must be multiple of 512) | Key Count | `0x14` | `word` | Total number of Keys defined | Record Length | `0x16` | `word` | Defined length of a Record (in bytes) | Physical Record Length | `0x18` | `word` | Physical spacing in bytes between Records in the database itself. At least as large as `Record Length`, and likely larger including whatever padding btrieve decides to use. Record Count | `0x1C` | `word` | Total number of Records across all `DATA` pages | Using the above, we can determine that `Page Count` is equal to `File Size / Page Length`. ### Key Definitions Key Definitions are also in the FCR, located starting at offset `0x110`. Each Key Definition record is `0x1E` (30) bytes long _(offsets relative to position within the 30 byte Key Definition)_ Name | Offset | Data Type | Definition -----|--------|-----------|-----------| Total Records | `0x06` | `word` | Total Number of Records which implement/make use of this Key Attributes | `0x08` | `word` | Attributes Flag for the Key (table below) Offset | `0x14` | `word` | Offset within the record which the Key is located (more information below) Length | `0x16` | `word` | Length of the Key within the record Key Data Type | `0x1C` | `byte` | Data Type of the Key (table below) Null Value | `0x1D` | `byte` | For columns marked as allowing NULL (Integers), this value is used to represent NULL ### Key Attributes Each Key Definition has an Attributes Flag which contains multiple attributes about the specified key Attribute | Mask | Definition | ----------|------|------------| Duplicates | `1` | Allow duplicate key values Modifiable | `1 << 1` | Key Value can be modified Old Style Binary | `1 << 2` | Used when `UseExtendedDataType` is not set. If true, binary blob, otherwise string. NullAllSegments | `1 << 3` | Whether the key supports null values. All segmented keys must have this same bit set. SegmentedKey | `1 << 4` | If the given key has any Segments NumberedACS | ` 1 << 5` | Whether the string type uses the ACS table DescendingKeySegment | ` 1 << 6` | Key is stored in descending Sort Order RepeatingDuplicatesKey | `1 << 7` | Different representation of duplicate key values UseExtendedDataType | `1 << 8` | Whether the key uses the key data types listed below, otherwise type is defined by `Old Style Binary`. NullAnySegment | `1 << 9` | Whether the key supports null values. Any segmented key with this value set indicates the entire key could be null. ### Key Data Types Each Key Definition has a defined Data Type, which specifies the underlying type of the Key. While most are primitive, a couple have implicit logic. Attribute | Code (Byte)| ----------|------| STRING | `0` INTEGER | `1` FLOAT | `2` DATE | `3` TIME | `4` DECIMAL | `5` MONEY | `6` LOGICAL | `7` NUMERIC | `8` BFLOAT | `9` LSTRING | `10` ZSTRING | `11` UNSIGNED BINARY | `12` AUTOINCREMENT | `13` NUMERICSTS | `14` NUMERICSA | `15` CURRENCY | `16` TIMESTAMP | `17` The majority of MajorBBS/Worldgroup modules only make use of `STRING`, `ZSTRING`, `INTEGER`, and `AUTOINCREMENT` # Pages After the FCR begins the actual Btrieve Pages. There are different Page Types within Btrieve. Page Type | Definition | ----------|------------| FCR | Contains metadata about the entire database ACS Page | Contains the ACS table used by the database Key Page | Contains the absolute offset in the Btrieve file for the specified Key value Data Page | Contains record data Variable Record Data Page | Contains variable record data While enumerating through the pages, the best method we've been able to employ to identify Page types is the following: _(offsets relative to the start of the Page)_ Page Type | How to Identify | ----------|------------| `FCR` | Always first page `ACS Page` | Always second page if exists. Starts with 0x000001000000AC `Key Pages` | Value `0xFFFFFFFF` at offset `0x8` `Data Pages` | Bit 7 set in the byte at offset `0x5` `Variable Record Data Pages` | No identifying information. Referenced via pointers inside the `Data Pages` # Understanding How MajorBBS/Worldgroup uses Btrieve In it's simplest form, Btrieve allows MajorBBS/Worldgroup Modules to easily save and retrieve `STRUCT` values quickly. A modern comparison would be a document store database for unstructured data similar to Mongo. Data in Btrieve, while it would appear through the Key Definitions that it is structured similar to modern SQL/RDBMS, it's actually a definition of values within a predefined `STRUCT`. For this example we'll use the following `STRUCT`: ```c struct Users{ int userId; char username[50]; char email[50]; char address[50]; char address[50]; } users; ``` If I wanted to quickly look up users based on their `userId` or `username`, I would want to define Btrieve Keys as such: ``` Key Position Type Null Values* Segment Length Flags 0 1 1 2 Integer -- 1 1 3 50 String M -- ``` What we've done now is define the `STRUCT` for our record, and we've now told Btrieve where the information lives within the record. If I wanted to query the Btrieve engine for the username `foo`, I might run a command similar to: `int success = GetByKey(key: 1, keyValue: "foo", result: *destinationStruct)` After executing this command, `success` would be `1` if the command executed successfully, and my desired struct would now be loaded at `*destinationStruct`. The only **_value_** ever retrieved from a Btrieve query is the `struct` which matched the query criteria. It's then up to the developer to programmatically extract the data needed. If I wanted to get the most recent user to sign up, I might run a command similar to: `int success = GetLastByKey(key: 0, keyValue: null, result: *destinationStruct)` This would tell the Btrieve engine to get the **LAST** value (since the key is numeric, _highest_) in the file for the specified key. This would return to me the user with the highest `userId` Btrieve allows MajorBBS/Worldgroup to step through Btrieve records as well relative to the current record. Using the previous query as an example, if I were to execute a query after it similar to: `int success = GetPreviousByKey(key: 0, keyValue: null, result: *destinationStruct)` It would return the `Previous` record, which in our query would mean the user with the _second_ highest `userId` n>