#!perl -w
use strict;

=head1 NAME

bioref.pl - example of bio data using Perl references

=head1 SYNTAX

perl bioref.pl biodata.txt

=head1 NOTES

Use the easiest Perl syntax for references.
Show how to load data and display it.
Show how references are created (implicitly and explicitly).
Identify best reference practices.

    o Best to stick to references completely
    o Best to use {} to create an empty hash reference
    o Best to use [] to create an empty array reference
    o Best to use %$name and @$name
    o Best to use -> notation
    o Use { ... } to surround complicated names
        %{$table->{$species}} or @{$table->{$species}{$gene}}
    o Let Array or Hash operations implicitly create references
    o Double up names for arrays of arrays
        $table->[$inner][$outer]
    o Double up names for hash of hashes
        $table->{$species}{$gene}{$fx}
    o Mix up names for mixed references
        $table->{$species}{$gene}[$index]

Data taken from Gene Cards, as referenced by the human gene compendium.
http://www.genecards.org/

Each gene was chosen for interest and common use.

=head1 DATA

    Human, ABO, Glycosyltransferases
    Human, ABO, pseudogene
    Human, ALB, Serum Albumin
    Drosophila, FUZ, fuzzy homolog
    Human, BCL2, Apoptosis regulator
    Mouse, ZFP91, zinc finger protein
    Human, INS, Insulin
    Human, HOXA@, Homeobox A cluster

=head1 OUTPUT

    Drosophila:
        FUZ: fuzzy homolog.
    Human:
        ABO: Glycosyltransferases, pseudogene.
        ALB: Serum Albumin.
        BCL2: Apoptosis regulator.
        HOXA@: Homeobox A cluster.
        INS: Insulin.
    Mouse:
        ZFP91: zinc finger protein.

=cut

# Create a reference to a HASH table
my $table = {};

# Create a hash of $species to $genes
# enable $descriptions of said $genes
my ($species, $gene, $description);
while (<>) {
    chomp;
    ($species, $gene, $description) = split /, /;

    # Perl implicitly does this as it goes ...
    # $table->{$species} = {} unless exists $table->{$species};
    # $table->{$species}{$gene} = [] unless exists $table->{$species}{$gene};
    push @{$table->{$species}{$gene}}, $description;
}

# Print each species, then its genes with their sorted descriptions
foreach $species (sort keys %$table) {
    print "$species:\n";
    foreach $gene (sort keys %{$table->{$species}}) {
        my @description = @{$table->{$species}{$gene}};
        print "\t$gene: ", join ', ', sort @description;
        print ".\n";
    }
}

__END__

=head1 NOTES

Source: Gene Ontology Consortium

Creates vocabularies which can be applied to various databases.
Plants, traits, genes, sequences, mammalian phenotype, mouse, cell type, ...
by molecular function, biological process, cellular component

    MF - describe activities
    BP - one or more MF events != pathway
    CC - anatomical structure or product group
         endoplasmic reticulum or proteasome, ribosome, protein dimer

http://www.geneontology.org/

Protein, Function, Location

    P is_expressed_in L
    P is_up_regulated_in F
    P is_down_regulated_in F
    F affects L
    P is_a_target_for F

=cut
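
=head1 EXAMPLE

A minimal sketch, not part of the script above, of how the
Protein/Function/Location relations listed in the notes might be stored
with the same reference style (a hash of hashes of arrays). The names
P1, F1 and L1 and the variables @triples and $relation are hypothetical
placeholders, not real Gene Ontology identifiers or an official format.

    use strict;
    use warnings;

    # Hypothetical (subject, predicate, object) triples; placeholders only
    my @triples = (
        [ 'P1', 'is_expressed_in',    'L1' ],
        [ 'P1', 'is_up_regulated_in', 'F1' ],
        [ 'F1', 'affects',            'L1' ],
    );

    # Start from an empty hash reference, as recommended above
    my $relation = {};
    foreach my $triple (@triples) {
        my ($subject, $predicate, $object) = @$triple;
        # push implicitly creates the inner hash and array references
        push @{$relation->{$subject}{$predicate}}, $object;
    }

    # Walk the structure in sorted order and print one line per relation
    foreach my $subject (sort keys %$relation) {
        foreach my $predicate (sort keys %{$relation->{$subject}}) {
            my @objects = sort @{$relation->{$subject}{$predicate}};
            print "$subject $predicate ", join(', ', @objects), "\n";
        }
    }

Keying on subject and then predicate mirrors $table->{$species}{$gene}
in the script, so the same push and sort-keys idioms carry over unchanged.

=cut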