Use of Ontologies and Probabilistic Relational Models to Aid in Cyber Crime Investigation Decision Support
The field of computer forensics is continuously evolving and is still in its infancy compared to other fields in information technology. Compared with a field such as data mining, the immaturity of forensics becomes even more apparent, and it is clear that much more research is needed. Industry has developed many automated tools that perform data mining, and each implements a proven concept that has been validated. Even though these data mining concepts may not have been applied to ontologies, the precepts on which the software is based are defined in basic database theory and schemas. Software tools for computer forensics have been developed to automate the manual evidence discovery and capture procedures currently used by investigators. These software packages provide some rudimentary capability to speed up the search process but provide no support in identifying potential evidence. The problem is that these tools can be used only after an investigator has shown "probable cause" to a judge and obtained a search warrant allowing the seizure of detailed evidence. Only then can the tools be applied to the seized data. Before then, investigators must rely on reports from victims or bystanders that a possible crime has been committed. Investigators depend on their own experience or that of their peers to determine whether a crime has been committed, whose jurisdiction it falls under, and whether there is sufficient information to proceed with an investigation. Computer forensics is the same as any other form of criminal investigation: computer data must be preserved, identified, extracted, documented, and interpreted. It is still more of an art form than a science, though its practitioners follow well-defined and clear procedures and methodologies.
To show "probable cause" to a judge and justify a search warrant, the field investigator relies on the kind of data provided in any other crime, on data obtained from incident response teams, and on network monitoring tools performing "trap and trace" functions. Only after a search warrant is obtained can evidence be seized. At this point, a detailed investigation of the seized material is conducted to support prosecution. The purpose of this dissertation is to describe a decision support methodology that may be used to determine whether there is sufficient information to demonstrate probable cause and to validate the completeness of the evidence obtained. This methodology would become the foundation of a framework in which information about a crime may be shared among investigators from various law enforcement agencies and industry, prosecutors and other litigators, and analysts, in order to track digital evidence of crimes or, through trend analysis, to identify when investigative resources need to be reallocated. Within this primary purpose, there are two goals. The first goal is to use an ontology-modeling tool to describe several laws that address the criminal use of computers. Two of the crimes modeled will have similar elements of proof, while the third ontology model will describe a very different law. The second goal is to postulate a probabilistic model that will be applied to evidence described using the attributes identified within the ontologies. Using a "best fit" inference methodology, the model results should identify whether a crime has been committed and which crime it is. The degree of fit will indicate whether there is sufficient evidence to justify "probable cause" for a search warrant. The three crimes were chosen to test whether the models have sufficient granularity to identify which crime has been committed.
By identifying the correct crime, the same model will identify which law enforcement agency is responsible for investigating it, since the enforcement of specific laws (i.e., the investigation of the corresponding crimes) is assigned to specific law enforcement agencies. Future research would allow modification of the probabilistic model to assist investigators in determining whether there is sufficient evidence for prosecution. The ontologies need to be sufficiently robust to describe statutes for crimes that are similar in nature but fall under different jurisdictions. In this way, the probabilistic model can be used to decide whether an investigation should be continued or passed on to another jurisdiction's law enforcement agency, thereby freeing the initial agency's assets for use elsewhere.
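The "best fit" idea described above can be illustrated with a minimal Python sketch. It is not the probabilistic relational model proposed in the dissertation: it swaps in a simple weighted score, and every name in it (the statutes, agencies, elements of proof, weights, and the 0.7 threshold) is a hypothetical placeholder chosen for illustration. Two statutes share most of their elements of proof and a third is very different, mirroring the three-crime setup.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass
class Statute:
    """A hypothetical statute described by weighted elements of proof."""
    name: str
    agency: str                 # hypothetical agency assigned to enforce it
    elements: Dict[str, float]  # element of proof -> relative weight


# Placeholder statutes: A and B have similar elements, C is very different.
STATUTES = [
    Statute("statute_a", "agency_x",
            {"unauthorized_access": 0.5, "protected_system": 0.3, "data_copied": 0.2}),
    Statute("statute_b", "agency_y",
            {"unauthorized_access": 0.5, "protected_system": 0.3, "damage_caused": 0.2}),
    Statute("statute_c", "agency_z",
            {"threatening_message": 0.6, "sent_electronically": 0.4}),
]


def fit(statute: Statute, evidence: Dict[str, float]) -> float:
    """Weighted share of the statute's elements supported by the evidence.

    `evidence` maps an attribute name to a confidence in [0, 1];
    attributes that were not reported count as 0.
    """
    total = sum(statute.elements.values())
    supported = sum(weight * evidence.get(element, 0.0)
                    for element, weight in statute.elements.items())
    return supported / total


def best_fit(evidence: Dict[str, float],
             threshold: float = 0.7) -> Tuple[Statute, float, bool]:
    """Return the best-fitting statute, its score, and whether the score
    reaches the (assumed) threshold standing in for probable cause."""
    best = max(STATUTES, key=lambda s: fit(s, evidence))
    score = fit(best, evidence)
    return best, score, score >= threshold


if __name__ == "__main__":
    reported = {"unauthorized_access": 1.0, "protected_system": 1.0,
                "damage_caused": 0.5}
    statute, score, sufficient = best_fit(reported)
    print(statute.name, statute.agency, round(score, 2), sufficient)
```

Because the best-fitting statute carries its assigned agency, the same lookup suggests which agency would take the investigation. A real model would replace the fixed weights with probabilities learned or elicited over the ontology's attributes.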