Final year B.Sc. projects 2004-2005

Tom Naughton, Room 2.104,

******************************************************************

Tom Naughton
Title: Remote interface for a distributed computing system
Pre-Requisites: Good programming skills; Interest in working in a group environment where high standards are demanded; Level of project difficulty: medium
Description: We have developed a general-purpose distributed computing system (http://www.cs.may.ie/distributed/) at the Department of Computer Science. As part of this system, we have developed a basic remote interface through which all interaction with the system is performed. The remote interface is used for tasks such as adding jobs, removing jobs, monitoring running jobs, downloading results, and getting feedback on the status of the distributed system.

We would like to greatly expand the functionality available through the remote interface such as:

  • full multi-user support (i.e. user logins with functionality to add users, remove users, monitor users, view user history etc.)
  • presentation of detailed information on:
    • every running computation
    • every donor machine (amount of processing performed, number of units processed, uptime/downtime, etc.)
    • overall status of the distributed system (number of donor machines, estimate of system performance, e.g. speed in Pentium years, etc.), possibly via a graphical representation
    • information on the set of donor machines (operating systems, locations, etc.)
      It should be noted that all of this information can be easily extracted from the existing system log files
  • improved usability and user friendliness of the remote interface
  • addition of a help index for inexperienced users
  • addition of secure communications with the server

This is not an exhaustive list and the student can make their own suggestions for improvements/enhancements to the remote interface. The student will be given our existing code for the remote interface (which does not have to be used) and the main task of this project will be to implement as many of these features as possible. The student will be expected to produce a fully working program by the end of this project. The student will be co-supervised by Ph.D. student Thomas Keane. If you have any questions, or would like to meet and talk about the project, please feel free to contact me at .

Language(s): Java
References: Papers describing the distributed computing system are available at http://www.cs.may.ie/distributed/.
Availability: Available

******************************************************************

Tom Naughton
Title: Top 500 Supercomputers
Pre-Requisites: Good programming skills; Level of project difficulty: high
Description: The Top 500 (top500.org) is a list, updated twice yearly, of the world's most powerful supercomputers. The measurement used to rank each supercomputer is the Linpack benchmark, which times the solution of a dense system of linear equations. We have developed a general-purpose distributed computing system (http://www.cs.may.ie/distributed/) in the Department of Computer Science. This system has been operational for a number of years and we have also developed a number of successful distributed applications to run on this system.

The main tasks of this project are to examine the Linpack library and determine the feasibility of running this measurement on our distributed computing system. Using our programming interface (see the developer manual), the student will be expected to produce a distributed program that will execute the Linpack program (or equivalent code) on our distributed computing system to determine the performance of our system. The student will be co-supervised by Ph.D. student Thomas Keane. If you have any questions, or would like to meet and talk about the project, please feel free to contact me at .
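
To give a feel for what a Linpack-style measurement involves, the following is a minimal single-machine sketch, not the Linpack code itself and not tied to our programming interface. It solves a dense system by Gaussian elimination with partial pivoting, times the solve, and converts the time to a floating-point rate using the nominal operation count 2n^3/3 + 2n^2 that the benchmark conventionally assumes. The class name, method names, problem size, and random seed are all illustrative assumptions; how to distribute such a computation over donor machines is exactly the open question of the project.

```java
import java.util.Random;

// Hypothetical sketch of a Linpack-style timing run on one machine.
final class LinpackStyleSketch {

    // Solves A x = b by Gaussian elimination with partial pivoting.
    // Overwrites a and b. Assumes A is square and nonsingular.
    static double[] solve(double[][] a, double[] b) {
        int n = b.length;
        for (int k = 0; k < n; k++) {
            // Pick the row with the largest pivot in column k.
            int p = k;
            for (int i = k + 1; i < n; i++) {
                if (Math.abs(a[i][k]) > Math.abs(a[p][k])) p = i;
            }
            double[] rowTmp = a[k]; a[k] = a[p]; a[p] = rowTmp;
            double rhsTmp = b[k]; b[k] = b[p]; b[p] = rhsTmp;

            // Eliminate column k from the rows below.
            for (int i = k + 1; i < n; i++) {
                double f = a[i][k] / a[k][k];
                for (int j = k; j < n; j++) a[i][j] -= f * a[k][j];
                b[i] -= f * b[k];
            }
        }
        // Back substitution on the upper-triangular system.
        double[] x = new double[n];
        for (int i = n - 1; i >= 0; i--) {
            double s = b[i];
            for (int j = i + 1; j < n; j++) s -= a[i][j] * x[j];
            x[i] = s / a[i][i];
        }
        return x;
    }

    // Nominal operation count conventionally used to report a rate.
    static double nominalFlops(int n) {
        return 2.0 * n * n * n / 3.0 + 2.0 * n * n;
    }

    public static void main(String[] args) {
        int n = 500;                       // illustrative problem size
        Random rng = new Random(1);        // fixed seed for repeatability
        double[][] a = new double[n][n];
        double[] b = new double[n];
        for (int i = 0; i < n; i++) {
            b[i] = rng.nextDouble();
            for (int j = 0; j < n; j++) a[i][j] = rng.nextDouble();
        }
        long start = System.nanoTime();
        solve(a, b);
        double seconds = (System.nanoTime() - start) / 1e9;
        System.out.printf("n=%d, %.3f s, %.1f MFLOP/s%n",
                n, seconds, nominalFlops(n) / seconds / 1e6);
    }
}
```

Elimination step k depends on the result of step k-1, so the work does not split into independent units as readily as, say, a key search; this is part of what the feasibility study would need to examine.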

Language(s): Java
References: Papers describing the distributed computing system are available at http://www.cs.may.ie/distributed/.
Availability: Available

******************************************************************

Tom Naughton
Title: Distributed sequence alignment
Pre-Requisites: Good programming skills; Three years of university-level biology; Level of project difficulty: high
Description: Sequence alignment is one of the most fundamental tasks in bioinformatics. The goal of sequence alignment is to identify similar regions in DNA, RNA, or protein sequences. The Smith-Waterman alignment algorithm has been widely acknowledged as being the most accurate technique for aligning two sequences. However, Smith-Waterman has a high space and time complexity, O(nm), where n and m are the lengths of the sequences being compared, meaning that for large databases of sequences it can be time-consuming to perform a full Smith-Waterman alignment using only a single processor. Recently, we completed the development of a distributed alignment application, called DCPal (http://www.cs.may.ie/distributed/), which implements the Smith-Waterman algorithm. DCPal allows the user to distribute the task of aligning a set of query sequences against a larger database of sequences over a set of semi-idle processors. We have completed a full performance analysis that demonstrates the potential of DCPal to speed up long alignment computations.
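
The quadratic cost mentioned above comes from filling an (n+1) by (m+1) score table, one cell per pair of positions. The following is a minimal sketch of that table-filling step. It is not taken from DCPal: the class and method names are hypothetical, and the scoring scheme (a fixed match/mismatch score and a linear gap penalty) is an assumption chosen for brevity. It returns only the best local alignment score and omits the traceback that recovers the alignment itself.

```java
// Hypothetical sketch of Smith-Waterman local alignment scoring.
final class SmithWatermanSketch {

    // Returns the best local alignment score of a against b.
    // mismatch and gap are expected to be negative.
    static int score(String a, String b, int match, int mismatch, int gap) {
        int n = a.length();
        int m = b.length();
        int[][] h = new int[n + 1][m + 1]; // row 0 and column 0 stay at zero
        int best = 0;
        for (int i = 1; i <= n; i++) {
            for (int j = 1; j <= m; j++) {
                int sub = a.charAt(i - 1) == b.charAt(j - 1) ? match : mismatch;
                int diag = h[i - 1][j - 1] + sub; // align a[i-1] with b[j-1]
                int up = h[i - 1][j] + gap;       // gap in b
                int left = h[i][j - 1] + gap;     // gap in a
                // A local alignment may restart anywhere, hence the floor at 0.
                h[i][j] = Math.max(0, Math.max(diag, Math.max(up, left)));
                best = Math.max(best, h[i][j]);
            }
        }
        return best;
    }

    public static void main(String[] args) {
        // Illustrative sequences and scores only.
        System.out.println(score("GGTTGACTA", "TGTTACGG", 2, -1, -1));
    }
}
```

Each query-versus-database-sequence comparison is independent of the others, which is what makes this workload a natural fit for distribution over many processors.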

However, DCPal is still in the early stages of development and we have identified a number of areas in which the program could be expanded:

  • support for all of the popular bioinformatics database formats (Genbank, EMBL, etc.)
  • addition of other alignment algorithms (Needleman-Wunsch, Blast, Fasta, etc.)
  • support for all the common alignment output formats (e.g. Blast format results)

This is not an exhaustive list and the student can make their own suggestions/enhancements. The student will be given our existing DCPal code and the main task of this project will be to implement as many of these features as possible. The student is expected to produce a fully working program by the end of the project. The student will be co-supervised by Ph.D. student Thomas Keane. If you have any questions, or would like to meet and talk about the project, please feel free to contact me at .

Language(s): Java
References: Papers describing the distributed computing system are available at http://www.cs.may.ie/distributed/. A preprint of the journal paper describing DCPal is also available.
Availability: Available

******************************************************************

Tom Naughton
Title: Research literature organising tool
Pre-Requisites: Good programming skills; Level of project difficulty: medium to high
Description: A software tool has been developed (by a previous final year student Pierce Gleeson) to organise the research literature of a research group. The tool manages a database of research papers via a WWW interface. Further enhancements are required to this tool. Currently research group members add/remove/update entries in the database. Each entry includes a BibTeX description of the paper, an abstract, a list of keywords, a PDF copy of the paper, and comments on the paper from research group members.

A query builder with which a researcher can easily perform complex searches is required. Also required is a dynamically generated graph in which every node is an author and every edge is a collaboration with another author. This will allow researchers to view patterns of collaboration and identify the most important authors in their field.
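
The collaboration graph described above can be derived from the author lists of the stored papers. The following is a minimal sketch of that derivation, written in Java purely for illustration (the tool itself uses PHP and MySQL). The class name, method names, and input shape (one list of author names per paper) are assumptions and do not reflect the tool's actual schema. Edge weights count joint papers, and the number of distinct co-authors is used as one simple, assumed indicator of an author's importance.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch: build a weighted co-authorship graph.
final class CoauthorGraphSketch {

    // Maps each author to their co-authors and the number of joint papers.
    static Map<String, Map<String, Integer>> build(List<List<String>> papers) {
        Map<String, Map<String, Integer>> graph = new TreeMap<>();
        for (List<String> authors : papers) {
            // Every author becomes a node, even on single-author papers.
            for (String a : authors) {
                graph.computeIfAbsent(a, k -> new TreeMap<>());
            }
            // Every unordered pair of authors on a paper is one collaboration.
            for (int i = 0; i < authors.size(); i++) {
                for (int j = i + 1; j < authors.size(); j++) {
                    String a = authors.get(i);
                    String b = authors.get(j);
                    if (a.equals(b)) continue; // ignore duplicated names
                    graph.get(a).merge(b, 1, Integer::sum);
                    graph.get(b).merge(a, 1, Integer::sum);
                }
            }
        }
        return graph;
    }

    // Number of distinct co-authors of the given author (0 if unknown).
    static int degree(Map<String, Map<String, Integer>> graph, String author) {
        Map<String, Integer> edges = graph.get(author);
        return edges == null ? 0 : edges.size();
    }

    public static void main(String[] args) {
        // Placeholder author names, not real data.
        List<List<String>> papers = Arrays.asList(
                Arrays.asList("Author A", "Author B"),
                Arrays.asList("Author A", "Author B", "Author C"),
                Arrays.asList("Author D"));
        System.out.println(build(papers));
    }
}
```

In the real tool the author lists would come from the BibTeX entries already stored for each paper, and the resulting map would be handed to whatever graph-drawing component is chosen.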

You can avail of guest access to the first version of the tool. The student will be co-supervised by Ph.D. student Andrew Page. If you have any questions, or would like to meet and talk about the project, please feel free to contact me at .

Language(s): PHP and MySQL
References: Pierce Gleeson's thesis is available.
Availability: Available

******************************************************************

Tom Naughton
Title: Massively distributed computing
Pre-Requisites: Good programming skills; Level of project difficulty: high
Description: We have developed a general-purpose distributed computing system (http://www.cs.may.ie/distributed/) at the CS Department. The basic idea behind this system is that our client software is installed on donor machines and runs in the background, using the 'idle' clock cycles of those machines to perform some large scientific computation. The system has been used for bioinformatics applications (DNA alignment and building massive phylogenetic trees), biomedical engineering applications (modelling the path of photons through the brain), and cryptography applications (breaking ElGamal keys, and brute-forcing MD5 passwords).

SETI@home is a similar distributed system and over 3 million people around the world have donated their machines' free clock cycles to the project. The main difference is that SETI@home is special-purpose, whereas our system can be programmed to perform any task.

We wish to adapt our system to allow it to be run on top of a Grid such as Condor or Globus. This would allow us to access massive computing resources on existing Grids. The student will be co-supervised by Ph.D. student Andrew Page. If you have any questions, or would like to meet and talk about the project, please feel free to contact me at .

Language(s): Java
References: Papers describing the distributed computing system are available at http://www.cs.may.ie/distributed/.
Availability: Available

******************************************************************

Tom Naughton
Title: Three-dimensional holographic video codec
Pre-Requisites: Good programming skills; An interest in image processing; Level of project difficulty: high
Description: Do you remember R2D2's hologram message of Leia Organa in Episode IV? Ever wonder if technology was going in that direction? Three-dimensional television and video is currently the subject of intensive research. One promising technique for 3D TV uses digital holography. A digital hologram stores multiple different perspectives of the same real-world object. This allows us to reconstruct the 3D object either digitally (in software) or optically. In the last five years, digital imaging technology has advanced to a stage where the capture, transmission, and display of digital holograms is now a realistic possibility.

A digital holographic 'camera' has already been built and we have a small database of 3D frames and video clips. The next step is to look at networking issues. We have experimented with coding and compression of digital holograms and want to extend this to video. One or two creative, imaginative, and highly competent students are required to join our group in the development of the first MPEG-style codec (compressor-decompressor) for digital holographic video. It is hoped that by the end of the project, Internet transmission of holographic video could be demonstrated. If you have any questions, or would like to meet and talk about the project, please feel free to contact me at .

Language(s): Matlab and/or Java
References: Recent presentations on single-frame digital hologram compression and Internet transmission of digital holograms. Two final year projects on this topic from last year are available for viewing. Also, an interesting book on 3D TV and Display Technology can be borrowed from the NUIM library.
Availability: Available

******************************************************************

Tom Naughton
Title: An automated teaching tool for computer theory
Pre-Requisites: An interest in computer theory and machine learning; Level of project difficulty: medium to high
Description: Multiple-choice test results are particularly suitable for machine analysis. By restricting a user's responses to a finite set of possibilities, it is possible to write an automated teaching tool that guides the user onto more difficult questions if they get questions correct, and onto easier questions otherwise. In an online system, the user's response to a question could be evaluated there and then and their next question chosen immediately. There could be a number of levels of easier questions, depending on "how wrong" the user's response was. This allows users to learn at their own pace and evaluate their progress through a course of study. Where a user chooses incorrectly, they might be referred to a short tutorial explaining why their choice was incorrect, or be referred to a particular page in the course notes.
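
The question-selection rule described above can be sketched in a few lines. The following is a hypothetical illustration, written in Java for consistency with the other projects in this list, and is not the logic of the existing tool. It assumes that difficulty is an integer level from 1 to some maximum, that a correct answer moves the user up one level, and that each answer option carries a "severity" number recording how wrong it is, with a worse answer dropping the user further.

```java
// Hypothetical sketch of adaptive difficulty selection for a quiz.
final class AdaptiveQuizSketch {

    // Returns the difficulty level of the next question.
    // current:  level of the question just answered (1..maxLevel)
    // correct:  whether the chosen option was the right one
    // severity: how wrong the chosen option was (ignored if correct);
    //           any wrong answer drops the user at least one level
    static int nextLevel(int current, boolean correct, int severity, int maxLevel) {
        int next = correct ? current + 1 : current - Math.max(1, severity);
        // Clamp to the range of levels that actually exist.
        return Math.max(1, Math.min(maxLevel, next));
    }

    public static void main(String[] args) {
        int level = 3;
        level = nextLevel(level, true, 0, 5);   // correct answer: harder question
        level = nextLevel(level, false, 2, 5);  // badly wrong answer: much easier
        System.out.println("Next question level: " + level);
    }
}
```

Keeping the rule as a pure function of the current level and the response separates it from the question bank, in the same spirit as a subject-independent teaching architecture.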

Such a web-based automated teaching tool was built in a previous year by Des Traynor (currently a PhD student). It consists of (i) a subject-independent teaching architecture, (ii) a completely separate "web of knowledge" of computer theory questions, solutions, and provision for short tutorials, and (iii) a system for recording students' grades. This teaching tool will be employed during computer theory lab sessions starting November 2004. A student is required to become familiar with the system, operate it in a lab environment for six weeks, and make any improvements as required. Full documentation will be provided. For the remainder of the project, the student will analyse the effectiveness of the system and design improvements. For example, the set of questions could be increased in size by designing a random choice generator. At the student's discretion, this project could also allow scope for an investigation into the relationship between the limits of artificial intelligence and the limits of automated teaching. If you have any questions, or would like to meet and talk about the project, please feel free to contact me at .

Language(s): Perl
References: Des Traynor's final year thesis is available on request.
Availability: Available

******************************************************************

