ACSYS BSC Limited

Basic company information

Products: Crime Workbench

Producer: Memex Technology Limited

Modules: Crime Workbench Server (includes Memex Intelligence Engine); Crime Workbench Client

Crime Workbench integrates with: Data Entry Module, i2 Analyst’s Notebook 5.0.25 & i2 Analyst’s Notebook 6.

Crime Workbench (CWB), developed by the British company Memex Technology Limited in co-operation with ACSYS BSC, allows the creation, management and use of textual databases employed in the criminal analysis process. It is particularly useful in multi-thread cases that cover a large territory, involve a complex structure of criminal associations and generate a large amount of information, where traditional or less sophisticated methods are not sufficient to follow and associate facts in order to build, and then verify or eliminate, investigation hypotheses. The main system features are as follows:

database creation and management;
collecting descriptive information about entities and events;
creating structural information about persons and other entities (address, vehicle etc.);
defining links that show associations between entities;
management of complex cases;
searching for information, based on extensive query language;
criminal analysis with the use of visualization techniques (full use of the graphical analysis functions requires installation of the CWB i2 Plug-in extension module and i2 Analyst’s Notebook);
managing user access to the gathered information;
extensive data access monitoring function;
action management;
automatic import of data from external information storage systems.

All system features are available from the Crime Workbench Client application level.

Crime Workbench is an operational and strategic intelligence tool allowing powerful data storage, searching, retrieval and analysis. It is specifically designed for the law enforcement, intelligence and fraud investigation communities. Crime Workbench is a client–server system. The clients can use the resources of one or more Crime Workbench servers. The server software can run on a Solaris platform (Solaris 2.7 or Solaris 2.8).

There are two client programs for connecting to the server:

Crime Workbench Client – a standard Windows application that includes the complete set of features for using and administering the system.
CW i2 Plug-in – a plug-in to i2 Analyst’s Notebook.

The client application provides a user interface to the powerful search facilities of the MIE (Memex Intelligence Engine). The MIE is installed on the server together with the Memex databases in which all of the intelligence data is stored.

System architecture

A Crime Workbench installation uses a combination of physical and logical servers. The most common type of installation is a single physical server supporting several logical servers.

The main types of installation are the following:

Centralized – This is a single-server installation. It consists of one physical server supporting one logical server with one set of data.
Logically Distributed – This is the most common type of installation and consists of one physical server with multiple logical servers.
Physically Distributed – In specific circumstances it is useful to have a system that is made up of more than one physical server. Each physical server contains one or more logical servers to which users can write, plus read-only copies of the logical servers on the remote physical servers. (Each physical server on which Crime Workbench is installed must also host an MIE installation. The MIE handles most of the database management tasks for Crime Workbench.)
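The three installation types above differ only in how many physical and logical servers are involved. A minimal sketch of that classification follows; the function name, the dictionary layout and the server names are illustrative assumptions and are not part of Crime Workbench.

```python
from typing import Dict, List


def installation_type(layout: Dict[str, List[str]]) -> str:
    """Classify a layout given as {physical server: [logical servers]}.

    The layout format is a hypothetical representation used for illustration.
    """
    if not layout or any(not logical for logical in layout.values()):
        raise ValueError("every physical server needs at least one logical server")
    if len(layout) > 1:
        # More than one physical server, each hosting its own logical servers.
        return "physically distributed"
    (logical_servers,) = layout.values()
    # One physical server: the number of logical servers decides the type.
    return "centralized" if len(logical_servers) == 1 else "logically distributed"
```

For example, a layout with one physical server hosting two logical servers is classified as logically distributed, the most common case described above.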

Crime Workbench Components

Crime Workbench is a client–server system. The client application is a Windows-based program that provides users with a graphical interface to the functions provided by the server.

On the server side, a Crime Workbench installation comprises one or more logical servers. The logical servers can be grouped into two types:

The configuration server
Non-configuration servers

Configuration server

Every Crime Workbench installation requires one configuration server. The configuration server is a special instance of a logical server – as well as holding intelligence data, the configuration server stores information relating to the configuration of the system (e.g. user settings and entity definitions). It is also the point of entry to the system during the login process.

The configuration server contains:

Appserver

The appserver (application server) performs certain server-side tasks for the client application – for example, adding and removing users from the system, assigning permissions to users, and changing passwords.

Management databases

These databases (mxAction, mxAuditnn, mxCase, mxDisseminate, mxEntity, mxLinkDB, mxServer and mxUserGroup) contain a variety of system information. With the exception of mxDisseminate they are not displayed in the client application’s Search Manager and so cannot be searched in the same way as intelligence databases.

Intelligence databases

The information input by users, via the Crime Workbench client application, is stored in intelligence databases, such as the Report, Address and Vehicle databases.

Non-configuration servers

A Crime Workbench installation may contain multiple non-configuration servers. Using non-configuration servers allows you to have more than one database of the same type. For example, if you want to have several Address databases, you must put each one on a separate server.

Each non-configuration server contains:

The intelligence databases

Typically, though not necessarily, the same selection of databases as those on the configuration server.

Two management databases

Only the mxAuditnn and mxDisseminate management databases are included (depending on a user’s permissions). All of the configuration data is centrally located on the configuration server.

Archive server

The archive server is a special type of non-configuration server, used to store deleted records. If an archive server is installed, whenever a user deletes a record, it is moved to the matching database on the archive server. For example, if you delete a record in an Address database it is moved to the Address database on the archive server. (Typically, access to the archive server is only given to administrative users.)
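The delete-to-archive behaviour can be sketched as follows. This is an illustration only: the LogicalServer class, the delete_record function and the in-memory record store are assumptions, not the actual Crime Workbench implementation.

```python
from typing import Dict, Optional


class LogicalServer:
    """Hypothetical logical server: databases by type name, records by id."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.databases: Dict[str, Dict[int, str]] = {}


def delete_record(live: LogicalServer, archive: Optional[LogicalServer],
                  database: str, record_id: int) -> None:
    """Remove a record and, if an archive server is installed, move it there."""
    record = live.databases[database].pop(record_id)
    if archive is not None:
        # The record lands in the database of the same type on the archive server.
        archive.databases.setdefault(database, {})[record_id] = record
```

When no archive server is passed, the record is simply removed, which corresponds to an installation without an archive server.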

Introduction to Memex Intelligence Engine

The Memex Intelligence Engine is a unique technology that has been developed specifically to meet the needs of organizations involved in law enforcement, national security and combating commercial fraud. It is neither a relational database management system nor a simple text indexing system, but a secure hybrid design with the benefits of both of these technologies.

The Memex Intelligence Engine is the backbone for Memex’s Crime Workbench.

Behind the scenes, the MIE provides the majority of the data input, search, retrieval and database management functions for the applications it supports.

Structured and free-text database systems

Today’s databases are primarily relational systems, based on the concept of highly structured data. These systems are very prescriptive about the types of data you can store within them and the methods you can use to access the data efficiently. As an example, records about a person will allow only one date of birth, and will not allow a partial or approximate date of birth. Typically, data can only be searched by some of the fields and little or no “fuzzy” searching is available.

By contrast, indexed, free-text systems allow you to search for any information within text documents. However, these systems also have their limitations. For example, the data must be completely indexed – an operation that takes some time. This prevents information from being searchable immediately on entry. Wholly free-text systems also offer no support for applying structure to data where you do require it, and little, if any, support for data security.

The Memex Intelligence Engine enables you to provide as little, or as much, structure to your data as you need. It allows you to search every part of your data without the need to define indexes, and it provides a framework of security and auditing for all functionality.

Basic concepts of Memex Intelligence Engine

Rather than storing each character as a numeric code between 1 and 255, as in the original ASCII text, the MIE converts incoming text into numeric codes at the level of whole words, through a process called tokenisation. Each unique word, text unit (such as an acronym), symbol or punctuation mark in a database is treated as a token. Each token is given a numeric code, which is assigned the first time the token is added to the database. The codes are stored as variable-length byte sequences, with most tokens being assigned either a 1-byte or a 2-byte code. In the default configuration, numbers in the incoming text are stored as numbers, rather than as tokens.

Additional information is stored about capitalization, document breaks, field separators and numbers. This information is also tokenised.
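The idea of tokenisation can be illustrated with a minimal sketch. It is not the MIE's actual algorithm or storage format: the Vocabulary class, the byte layout in encode and the splitting rule are assumptions, and the sketch ignores capitalization, field separators and the special handling of numbers described above.

```python
import re
from typing import Dict, List


class Vocabulary:
    """Assigns a numeric code to each token the first time it is seen."""

    def __init__(self) -> None:
        self.codes: Dict[str, int] = {}

    def code_for(self, token: str) -> int:
        # A new token receives the next unused code; known tokens reuse theirs.
        if token not in self.codes:
            self.codes[token] = len(self.codes)
        return self.codes[token]


def split_tokens(text: str) -> List[str]:
    """Split text into lower-cased words and single punctuation marks."""
    return re.findall(r"\w+|[^\w\s]", text.lower())


def encode(code: int) -> bytes:
    """Hypothetical layout: 1 byte for codes below 128, otherwise 2 bytes."""
    if code < 0x80:
        return bytes([code])
    if code < 0x4000:
        # The high bit of the first byte marks a 2-byte code.
        return bytes([0x80 | (code >> 8), code & 0xFF])
    raise ValueError("this sketch supports only 1-byte and 2-byte codes")


def tokenise(text: str, vocab: Vocabulary) -> bytes:
    """Convert text into a sequence of variable-length token codes."""
    return b"".join(encode(vocab.code_for(t)) for t in split_tokens(text))
```

In this sketch, a repeated word costs one or two bytes each time it occurs, however long the word is, which is where the compression comes from.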

Once data is stored in coded form in a Memex database it can be searched very rapidly. The Memex Intelligence Engine searches for the small byte sequences that represent the required words. This is a far quicker process than searching the original raw text, and is aided by the fact that using tokenisation compresses the original text by up to 70 per cent.

To further improve performance, the database is split into separate files called segments. Data is indexed to segment level, so that the MIE can quickly determine which segments contain a given word, or combination of words. This allows the MIE to limit its search to those segment files that contain the word, or words, in the user’s search expression. The segment index (which is also known as the navigate and search map, or, more simply, as the map file) allows rapid elimination of a great deal of data, and means that typically only 5–10 per cent of the data needs to be scanned. Additionally, when a database does not contain the word or phrase being searched for, the MIE can determine this without having to search any of the coded segments.
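Segment-level indexing can be sketched as a map from each word to the segments that contain it. The functions below are an illustrative assumption about how such a map could work, not a description of the MIE's map file format.

```python
from typing import Dict, Iterable, List, Set


def build_map(segments: List[List[str]]) -> Dict[str, Set[int]]:
    """Record, for every word, the numbers of the segments that contain it."""
    segment_map: Dict[str, Set[int]] = {}
    for number, words in enumerate(segments):
        for word in words:
            segment_map.setdefault(word, set()).add(number)
    return segment_map


def candidate_segments(segment_map: Dict[str, Set[int]],
                       query_words: Iterable[str]) -> Set[int]:
    """Return the segments containing all query words; only these need scanning."""
    result = None
    for word in query_words:
        found = set(segment_map.get(word, set()))
        result = found if result is None else result & found
        if not result:
            # A word missing from the map rules out the whole database
            # without any coded segment being read.
            return set()
    return result or set()
```

A query for a word that appears in no segment returns an empty set straight from the map, which mirrors the case described above where the MIE does not need to search any coded segment.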

How the system is organized

The Memex Intelligence Engine and the applications it supports are organized in a client–server relationship. Search facilities and data access controls are provided to client applications – such as Crime Workbench and CWB i2 Plug-in – by a set of server-side programs. The MIE can also be accessed by custom-built solutions, using one of the programming APIs.

The MIE itself consists of a central server program and a suite of programs and utilities that carry out specific tasks. Data is stored in one or more Memex databases, which are typically located on a single physical server.

Client applications connect to the MIE via TCP/IP, specifying the server’s host name and the network port on which the MIE listens for connections. Typically, the MIE uses port 590, although another port can be chosen, if required.
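A client-side connection set-up might look like the sketch below. Only the TCP/IP transport and the typical port 590 come from the text; the function names and the host name in the usage note are assumptions, and the MIE's own wire protocol and programming APIs are not covered.

```python
import socket
from typing import Tuple

DEFAULT_MIE_PORT = 590  # typical port named in the text; may be changed


def mie_address(host: str, port: int = DEFAULT_MIE_PORT) -> Tuple[str, int]:
    """Build the (host, port) pair a client would connect to."""
    if not 0 < port < 65536:
        raise ValueError("port must be between 1 and 65535")
    return (host, port)


def open_connection(host: str, port: int = DEFAULT_MIE_PORT,
                    timeout: float = 5.0) -> socket.socket:
    """Open a plain TCP connection; what is sent afterwards is up to the API."""
    return socket.create_connection(mie_address(host, port), timeout=timeout)
```

For instance, mie_address("mie-host") yields the default port, while mie_address("mie-host", 6000) reflects an installation where another port was chosen.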

The Memex server

The MIE runs as two processes, called aisvr and bcp. The tasks performed by the Memex server fall roughly into two categories:

Application-related

Controlling all database access, input and maintenance on behalf of client applications. Where appropriate, the Memex server also provides user authentication and handles the connection of networked applications.

Search-related

Performing all search operations on Memex databases. After processing the search commands it receives from client applications, the Memex server runs the appropriate search, provides information on the status of the search, and returns any hits it finds.

The Memex server is usually set up to start automatically when the computer hosting the MIE is rebooted. On a UNIX server, this is achieved by using a startup script.

Other parts of the system

An installed MIE system consists of a large collection of programs and files. The following list briefly describes the main constituent parts of the system – other than the Memex server and its configuration file – and indicates where you can find more information.

Databases

Memex databases comprise a number of files, including:

one or more coded segment files (often referred to as “cod” files);
database configuration file, containing definitions of fields, level separators, the security classification for the database, etc.;
vocabulary file;
map file.

Database registry

The MIE’s Registry file is a text file that can be used to identify and organize the databases that are available to an application.

For each database, the registry contains such information as:

creation date/time of the database;
host name of the computer on which the database is located;
path to the directory in which the database files are stored;
unique number that applications use to identify the database, rather than using the path, thereby making it easier to change the physical location of the database.
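The benefit of identifying a database by number rather than by path can be shown with a small in-memory model. The RegistryEntry and Registry classes, their field names and the sample values are hypothetical; the actual layout of the registry text file is not described here.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class RegistryEntry:
    database_id: int   # unique number used by applications
    created: str       # creation date/time of the database
    host: str          # computer on which the database is located
    directory: str     # directory holding the database files


class Registry:
    """Hypothetical in-memory view of the registry contents."""

    def __init__(self) -> None:
        self.entries: Dict[int, RegistryEntry] = {}

    def register(self, entry: RegistryEntry) -> None:
        self.entries[entry.database_id] = entry

    def locate(self, database_id: int) -> str:
        """Resolve a database number to its current host and directory."""
        entry = self.entries[database_id]
        return f"{entry.host}:{entry.directory}"

    def relocate(self, database_id: int, host: str, directory: str) -> None:
        # Only the registry changes; applications keep using the same number.
        entry = self.entries[database_id]
        entry.host = host
        entry.directory = directory
```

Because applications hold only the number, moving the database files requires a registry update and no change on the application side.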

Security file(s)

Security is applied to data using a system of locks and keys. The locks, together with the specification of which keys each user possesses, are defined in a security class, within a security file. Each database is assigned a security class, from which it takes its security settings. Applications may use a number of security classes, stored in one or more security files – for example, assigning a particular security class to a particular type of database. Alternatively (as is the case with Crime Workbench), an application may use a single security class in a single security file.

The file path and name of the security file are specified as a configuration parameter of the Memex server.
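The lock-and-key model can be sketched as a simple set comparison. The rule used here, that a user needs a key for every lock on an item, is an assumption made for illustration, as are the SecurityClass structure and the lock names; the format of the security file is not described in this document.

```python
from dataclasses import dataclass, field
from typing import Dict, Set


@dataclass
class SecurityClass:
    """Hypothetical security class: which keys each user possesses."""

    name: str
    user_keys: Dict[str, Set[str]] = field(default_factory=dict)

    def can_open(self, user: str, locks: Set[str]) -> bool:
        # Assumed rule: access requires a key for every lock on the item.
        return locks <= self.user_keys.get(user, set())
```

Under this assumed rule, an item with no locks is open to every user, and a user unknown to the security class can open only such items.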

Security journal

The security journal is a low-level log of database events. It is a text file, but identifies logged events using a code, to keep the size of the file to a minimum. The file path and name of the security journal, and the events it logs, are specified as configuration parameters of the Memex server.

Error log

The MIE logs all errors in a text file. The file path and name of this file are specified as a configuration parameter of the Memex server. The error log also records configuration information about the system each time the Memex server is started. For debugging purposes, the Memex server can be configured to log all Memex API function calls to the error log.

Query expanders file

This text file allows users to access functions directly from the query line, rather than via an application control. For example, the function mx_phonetic can be mapped to the word SOUNDS, allowing a search such as (simpson)SOUNDS.

The file also maps Boolean operators to words. This allows users to enter a query such as dog NOT spaniel, instead of dog ! spaniel. The file is also used to specify the file path and name of the default thesaurus file.

The file path and name of the query expanders file are specified in the Memex server’s configuration file.
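The two mappings described above can be sketched as simple lookups. The dictionaries, function names and the returned tuple are assumptions for illustration; only the examples NOT, !, SOUNDS and mx_phonetic come from the text, and the real syntax of the query expanders file is not shown here.

```python
import re
from typing import Dict, Optional, Tuple

# Matches the "(term)WORD" form used in the example (simpson)SOUNDS.
FUNCTION_CALL = re.compile(r"\((?P<arg>[^()]*)\)(?P<word>[A-Z]+)")


def expand_operators(query: str, operators: Dict[str, str]) -> str:
    """Replace operator words (e.g. NOT) with their symbols (e.g. !)."""
    return " ".join(operators.get(word, word) for word in query.split())


def resolve_function(term: str,
                     functions: Dict[str, str]) -> Optional[Tuple[str, str]]:
    """Map "(argument)WORD" to (function name, argument), if WORD is known."""
    match = FUNCTION_CALL.fullmatch(term)
    if match is None or match["word"] not in functions:
        return None
    return functions[match["word"]], match["arg"]
```

With operators {"NOT": "!"}, the query dog NOT spaniel becomes dog ! spaniel; with functions {"SOUNDS": "mx_phonetic"}, the term (simpson)SOUNDS resolves to the function mx_phonetic applied to simpson.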

Thesaurus file

The MIE can search for words that have a similar meaning to a specified word. In order to perform synonym expansion, the system must have access to at least one thesaurus file. Each database can have a thesaurus assigned to it by specifying the path to the thesaurus file in the database’s configuration file. If a database does not have an individually assigned thesaurus, a default thesaurus is used. The default thesaurus is specified in the query expanders file.
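The choice between a database's own thesaurus and the default one can be sketched as a simple fallback. The "thesaurus" configuration key, the file paths and the in-memory thesaurus structure are hypothetical and serve only to illustrate the rule.

```python
from typing import Dict, List


def thesaurus_for(database_config: Dict[str, str], default_thesaurus: str) -> str:
    """Use the database's own thesaurus path if set, otherwise the default."""
    return database_config.get("thesaurus") or default_thesaurus


def expand_synonyms(word: str, thesaurus: Dict[str, List[str]]) -> List[str]:
    """Return the word followed by any synonyms listed for it."""
    return [word] + thesaurus.get(word, [])
```

A database whose configuration names no thesaurus therefore searches with the default thesaurus given in the query expanders file.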

Link repositories

A link repository is a special kind of database, designed for storing information about links between records. Crime Workbench, for example, uses three link repositories for storing the details of explicit, implicit and case links. Link repositories use a proprietary format and the data they contain cannot be modified in any way, other than via the MIE.

Query histories

A query history is a collection of information concerning searches run by a user. Each query history entry stores the details of an individual search, including the results returned by that search. Each user can have one or more query histories, created and maintained by a client application. Typically, query histories are created in a directory used for temporary files – for example, /tmp.

Utilities

The MIE comes complete with a set of tools for manipulating or repairing databases, loading/dumping data to/from a link repository, and adding query expanders.