|
|
Research Interests
Introduction
As a PhD student, I am investigating the problem of inducing
Stochastic Graph Grammars,
and of exploring the utility of such grammatical models.
I am pursuing research at the
CORAL Lab at UMBC, under the
supervision of Dr. Tim Oates .
Here, I will briefly explain what graph grammars are, and why they are
important (and interesting!).
Graph Grammars
A grammar is a set of rules that describes a language. A language is a
collection of sentences; a sentence, in turn, is a linear
sequence of words. A graph grammar, on the other hand, describes languages
that are collections of graphs, instead of linear sentences. In a few minutes,
I will describe why such grammars are useful; however, first we need
a few more definitions.
Stochastic Graph Grammars
A grammar tells us whether a given sentence belongs to a language or not.
A stochastic grammar, additionally, gives us the probability that
the sentence was generated by the grammar. Similarly, a stochastic graph
grammar tells us how probable it is for a given graph to have been
generated by the grammar.
A stochastic graph grammar is, thus, a model for a probability
distribution of graphs. Such models are typically useful in scenarios where
the graphs have a hierarchical structure. Non-terminal symbols represent
subgraphs that behave as a unit; such a subgraph can itself be
a collection of smaller subgraphs, that are units in their own right. The
hierarchical and recursive nature of grammars makes them ideal as models
for such distributions. As the next section shows, there are several
real life scenarios where graph grammars provide a feasible tool for
probabilistic modeling.
Applications
We now consider the following domains in which stochastic graph grammars
are useful.
-
Understanding the Structure of Large Chemical Compounds
Although the structure of many organic compounds, such as proteins, may
be huge, such structures tend to be heirarchical in nature. A molecule is
composed of a collection of functional groups , which are groups of
atoms that behave as a single unit. A functional group, in turn, may be
composed of smaller functional groups. Finally, it is often the case that
functional groups can be partitioned into classes, where all members of the
same class can be represented by the same formula. A simple example is
CnH2n+1 that represents all alkyl groups.
These properties clearly indicate that organic molecules
are amenable to grammatical representations.
How are such models useful?
Once a grammar has been learned for a family of
such molecules, we can generate samples from the underlying distribution of
those molecules, assisting in compound design. Besides, given a molecule,
we can calculate the probability that it was generated by this distribution.
Intuitively, more stable molecules are more probable. Thus, such analysis
can give us insight into the stability of the compound.
-
Understanding and Classifying Social Networks
Social networks can be represented as graphs, where nodes represent
individuals, and edges represent acquaintance. Like organic molecules,
social networks, too, have a hierarchical, recursive structure. A network
may be composed of several communities; a community itself maybe a collection
of several smaller communities.
If a family of social networks can be modeled using a stochastic graph grammar,
then, given a new network, we can tell how likely it is that the network belongs to
the same family. This can be useful in classifying networks, and, in partcular,
identifying malicious ones.
|