(For those who have taken the courses and want to submit for evaluation, please read the instructions linked on the table of contents page.)
- Database integration is one of the most needed information technologies for the future of genomics research, as the Nature article above also notes. A new data model called XML is under rapid development and is being adopted by the bioinformatics community. It is especially well suited to data integration. Knowledge of XML is becoming a common necessity for biologists, as it is in every other field that deals with large amounts of heterogeneous, interconnected data. To get started with XML, write an XML document (either a well-formed XML document or a valid one with a separate DTD) that stores the results of a molecular biology experiment of your choice.
- Since “a very long list of something” is now commonplace in biological research, a computational tool for handling such lists is a necessity for biologists. This tool is the database management system (DBMS). To get started with a DBMS, write a relational schema that stores many instances of the experimental result you used for the XML problem above. Also write two or more SQL queries for common tasks on that data.
- Biological data are connected to one another in some way, so database integration is currently one of the key issues in bioinformatics. As the person who is (or will become) responsible for bioinformatics in your group, you must have both the understanding and the practical knowledge the job requires. Currently, the most common way to solve the integration problem is to use simple cross-reference links. (Of course, there are much more advanced methods, as I mentioned in the lecture.) Visit the NCBI site and find out how it handles this problem. Briefly describe what you found.
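
To give a feel for the first two problems, here is a minimal sketch of how one small XML document and a matching relational schema might fit together. It is only an illustration, not a model answer: the qPCR experiment, every element, attribute, table, and column name, and all the values are invented for this example, and the helper names (`build_db`, `mean_ct_per_gene`, `measurements_for`) are hypothetical. It uses Python's standard `xml.etree.ElementTree` and `sqlite3` modules, so any other DBMS or parser would do just as well.

```python
import sqlite3
import xml.etree.ElementTree as ET

# A well-formed XML document for an invented qPCR experiment.
# All names and numbers here are made up for illustration.
XML_DOC = """<experiment id="exp001" assay="qPCR">
  <sample id="s1" treatment="control">
    <measurement gene="recA" ct="21.4"/>
    <measurement gene="lexA" ct="24.9"/>
  </sample>
  <sample id="s2" treatment="UV-treated">
    <measurement gene="recA" ct="18.2"/>
    <measurement gene="lexA" ct="22.5"/>
  </sample>
</experiment>"""

# One table per nesting level of the XML; foreign keys replace the nesting.
SCHEMA = """
CREATE TABLE experiment (
    experiment_id TEXT PRIMARY KEY,
    assay         TEXT NOT NULL
);
CREATE TABLE sample (
    sample_id     TEXT PRIMARY KEY,
    experiment_id TEXT NOT NULL REFERENCES experiment(experiment_id),
    treatment     TEXT NOT NULL
);
CREATE TABLE measurement (
    sample_id TEXT NOT NULL REFERENCES sample(sample_id),
    gene      TEXT NOT NULL,
    ct        REAL NOT NULL,
    PRIMARY KEY (sample_id, gene)
);
"""


def build_db(xml_text):
    """Parse the XML and load it into a fresh in-memory SQLite database."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    root = ET.fromstring(xml_text)
    exp_id = root.get("id")
    conn.execute("INSERT INTO experiment VALUES (?, ?)",
                 (exp_id, root.get("assay")))
    for sample in root.findall("sample"):
        conn.execute("INSERT INTO sample VALUES (?, ?, ?)",
                     (sample.get("id"), exp_id, sample.get("treatment")))
        for m in sample.findall("measurement"):
            conn.execute("INSERT INTO measurement VALUES (?, ?, ?)",
                         (sample.get("id"), m.get("gene"), float(m.get("ct"))))
    conn.commit()
    return conn


def mean_ct_per_gene(conn):
    """Query 1: average Ct value for each gene across all samples."""
    return conn.execute(
        "SELECT gene, AVG(ct) FROM measurement GROUP BY gene ORDER BY gene"
    ).fetchall()


def measurements_for(conn, treatment):
    """Query 2: all measurements from samples with a given treatment."""
    return conn.execute(
        "SELECT m.gene, m.ct FROM measurement AS m "
        "JOIN sample AS s ON s.sample_id = m.sample_id "
        "WHERE s.treatment = ? ORDER BY m.gene",
        (treatment,),
    ).fetchall()


if __name__ == "__main__":
    db = build_db(XML_DOC)
    print(mean_ct_per_gene(db))
    print(measurements_for(db, "UV-treated"))
```

Note how each level of nesting in the XML becomes its own table, with a foreign key standing in for the parent-child relationship. Your own answer should of course use an experiment type and queries that matter to your work.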