CMSC 691B Topic Ideas
Active learning
Suggested by Prof. Marie desJardins
Description: Machine learning techniques for selecting the most useful
instance to label next.
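As a minimal sketch of what "selecting the most useful instance" can mean, the block below uses uncertainty sampling, one common selection strategy and not necessarily the method of the papers listed under this topic. The names `most_uncertain` and `toy_predict_proba` are hypothetical, and the toy model stands in for whatever probabilistic binary classifier has been trained on the labels gathered so far.

```python
import math


def most_uncertain(pool, predict_proba):
    """Return the index of the unlabeled instance whose predicted
    positive-class probability is closest to 0.5."""
    return min(range(len(pool)), key=lambda i: abs(predict_proba(pool[i]) - 0.5))


def toy_predict_proba(x):
    # Hypothetical stand-in for a trained classifier: a logistic curve
    # whose decision boundary sits at x = 2.0.
    return 1.0 / (1.0 + math.exp(-(x - 2.0)))


pool = [0.1, 1.8, 3.5, 5.0]          # unlabeled instances (assumed 1-D features)
next_index = most_uncertain(pool, toy_predict_proba)
print(pool[next_index])              # the instance nearest the decision boundary
```

In a full active learning loop, the chosen instance would be sent to a labeler, added to the training set, and the model retrained before the next selection.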
Relevant publications:
Machine Learning Journal, Journal of Machine Learning
Research, AAAI, IJCAI, Journal of AI Research
- Cohn, Atlas, and Ladner, "Improving generalization with active learning,"
Machine Learning 15(2): 201-221, 1994.
- Cohn, Ghahramani, and Jordan, "Active learning with statistical
models," JAIR 1996.
- Tong and Koller, "Active learning for structure in Bayesian networks",
Plan explanation and summarization
Suggested by Prof. Marie desJardins
Description: Methods for automatically producing concise, informative
summaries of plans produced by AI planning systems (particularly
hierarchical task network planners).
Relevant publications: AAAI, IJCAI, Artificial Intelligence Journal, Journal of
AI Research, AIPS
- Clement and Durfee, "Theory for coordinating concurrent hierarchical
planning agents using summary information," AAAI-99
- Weld, "An introduction to least-commitment planning," AI Magazine
- Mulvehill, "Plan comparison and summarization," AIPS-98
- Autobrief,
Distributed Information Retrieval
Suggested by Prof. Charles Nicholas
Description: Techniques for dividing documents, indices, or metadata
among a set of computers linked by a network.
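One way to picture dividing documents among machines is document partitioning: each node indexes only its own share of the collection, and a broker forwards a query to the nodes and merges their answers. The sketch below simulates this in a single process; the `Node` class, `build_partitions`, `broker_search`, and the modulo assignment rule are all assumptions made for illustration, and the broker here naively queries every node rather than selecting promising ones.

```python
from collections import defaultdict


class Node:
    """One simulated machine holding an inverted index for its documents."""

    def __init__(self):
        self.index = defaultdict(set)   # term -> set of document ids

    def add(self, doc_id, text):
        for term in text.lower().split():
            self.index[term].add(doc_id)

    def search(self, term):
        return self.index.get(term.lower(), set())


def build_partitions(docs, num_nodes):
    """Assign each document to a node by doc_id modulo num_nodes (assumed rule)."""
    nodes = [Node() for _ in range(num_nodes)]
    for doc_id, text in docs.items():
        nodes[doc_id % num_nodes].add(doc_id, text)
    return nodes


def broker_search(nodes, term):
    """Send the query term to every node and merge the matching document ids."""
    hits = set()
    for node in nodes:
        hits |= node.search(term)
    return sorted(hits)


docs = {
    0: "distributed information retrieval",
    1: "database selection for distributed searching",
    2: "vector space databases",
    3: "spam filtering",
}
nodes = build_partitions(docs, 2)
print(broker_search(nodes, "distributed"))
```

Partitioning by term (each node holding the full posting lists for a subset of the vocabulary) is the usual alternative, with different trade-offs in query routing and load balance.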
Publications: SIGIR conference, CIKM conference, and sometimes VLDB.
Journals include Information Retrieval, Information Processing and
- J. Callan. Distributed information retrieval. In W.B. Croft, editor,
Advances in information retrieval, chapter 5, pages 127-150. Kluwer
Academic Publishers, 2000.
- A. L. Powell, J. C. French, and J. Callan. The impact of database
selection on distributed searching. SIGIR 2000.
- L. Gravano and H. Garcia-Molina. Generalizing GlOSS to vector-space
databases and broker hierarchies. VLDB Conf., 1995.
eBiquity Topics
Suggested by Prof. Tim Finin
See the eBiquity lab's pages for a list of related project ideas.
Spam Filtering
Suggested by Prof. Charles Nicholas
Description: Everybody knows what spam is: email that is unsolicited and unwanted.
Publications: USENIX Internet Technology conference, IEEE Internet Computing magazine, many web sites.
There are many papers on information filtering, and likely many that deal with spam specifically.
Evaluation of Clustering Algorithms
Suggested by Prof. Charles Nicholas
Description: Clustering algorithms are used to group data objects into sets, for which one data object can serve as a representative. But how can one tell if one clustering algorithm produces "better" clusters than another?
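One simple way to compare two clusterings, assuming gold-standard class labels are available for the data, is purity: the fraction of objects that fall in the majority class of their cluster. The sketch below is only an illustration of that one measure; the function name `purity` and the toy labels are made up for the example.

```python
from collections import Counter, defaultdict


def purity(cluster_ids, true_labels):
    """Fraction of objects that belong to the majority class of their cluster."""
    members = defaultdict(list)
    for cluster, label in zip(cluster_ids, true_labels):
        members[cluster].append(label)
    # For each cluster, count how many members carry its most common label.
    majority_total = sum(
        Counter(labels).most_common(1)[0][1] for labels in members.values()
    )
    return majority_total / len(true_labels)


true_labels = ["a", "a", "a", "b", "b", "b"]   # assumed gold-standard classes
clustering_1 = [0, 0, 0, 1, 1, 1]              # output of a hypothetical algorithm 1
clustering_2 = [0, 0, 1, 1, 1, 0]              # output of a hypothetical algorithm 2

print(purity(clustering_1, true_labels))
print(purity(clustering_2, true_labels))
```

Purity has a known weakness: it can be driven to 1.0 by putting every object in its own cluster, so it is only meaningful when the clusterings being compared have similar numbers of clusters. Measures that need no gold labels, based only on distances within and between clusters, are the other main family.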
Publications: SIGIR conference, CIKM conference, many if not all data mining conferences, statistical literature on cluster analysis
Peter Willett's survey from Information Processing and Management 1986 is seminal but now somewhat dated.
Many papers talk about k-means and the numerous variations.