#!perl -w
use strict;

=head1 NAME

bioref.pl - example of bio data using Perl references

=head1 SYNTAX

    perl bioref.pl biodata.txt

=head1 NOTES

Use easiest Perl syntax for references.
Show how to load data and display it.
Show how references are created (implicitly and explicitly).
Identify best reference practices.

    o Best to stick to references completely
    o Best to use {} to create an empty hash reference
    o Best to use [] to create an empty array reference
    o Best to use %$name and @$name
    o Best to use -> notation
    o Use { ... } to surround complicated names
        %{$table->{$species}} or @{$table->{$species}{$gene}}
    o Let array or hash operations implicitly create references
      (autovivification)
    o Double up subscripts for arrays of arrays $table->[$outer][$inner]
    o Double up subscripts for hash of hashes $table->{$species}{$gene}{$fx}
    o Mix subscripts for mixed references $table->{$species}{$gene}[$index]

Data taken from Gene Cards, the human gene compendium.
http://www.genecards.org/
Each gene was chosen for interest and common use.
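
A minimal sketch of the practices listed above, kept separate from
this script. The names $list, $first and $count and the sample
values are illustrative assumptions, not part of bioref.pl.

    use strict;
    use warnings;

    my $table = {};    # empty hash reference
    my $list  = [];    # empty array reference

    # @$name is enough for a simple name
    push @$list, 'Human', 'Mouse';

    # { ... } surrounds a complicated name; the inner hash and the
    # array are created implicitly by the push
    push @{ $table->{Human}{ABO} }, 'pseudogene';

    # mixed references: hash, hash, then array index
    my $first = $table->{Human}{ABO}[0];

    # %{ ... } turns a nested hash reference back into a hash
    my $count = keys %{ $table->{Human} };

    print join(', ', @$list), "\n";
    print "$first ($count gene)\n";
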

=head1 DATA

    Human, ABO, Glycosyltransferases
    Human, ABO, pseudogene
    Human, ALB, Serum Albumin
    Drosophila, FUZ, fuzzy homolog
    Human, BCL2, Apoptosis regulator
    Mouse, ZFP91, zinc finger protein
    Human, INS, Insulin
    Human, HOXA@, Homeobox A cluster

=head1 OUTPUT

    Drosophila:
        FUZ: fuzzy homolog.
    Human:
        ABO: Glycosyltransferases, pseudogene.
        ALB: Serum Albumin.
        BCL2: Apoptosis regulator.
        HOXA@: Homeobox A cluster.
        INS: Insulin.
    Mouse:
        ZFP91: zinc finger protein.

=cut

# Create a reference to a HASH table
my $table = {};

# Map each $species to its genes,
# and each $gene to a list of its descriptions
my ($species, $gene, $description);
while (<>) {
    chomp;
    next unless /\S/;    # skip blank lines
    # Limit of 3 keeps any ", " inside the description intact
    ($species, $gene, $description) = split /, /, $_, 3;

    # Perl implicitly does this as it goes (autovivification) ...
    # $table->{$species} = {} unless exists $table->{$species};
    # $table->{$species}{$gene} = [] unless exists $table->{$species}{$gene};
    push @{ $table->{$species}{$gene} }, $description;
}

# Print species, genes and descriptions in sorted order
foreach $species (sort keys %$table) {
    print "$species:\n";
    foreach $gene (sort keys %{ $table->{$species} }) {
        my @description = @{ $table->{$species}{$gene} };
        print "\t$gene: ", join(', ', sort @description), ".\n";
    }
}

__END__

=head1 NOTES

Source: Gene Ontology Consortium

Creates vocabularies which can be applied to various databases.
Plants, traits, genes, sequences, mammalian phenotype, mouse,
cell type, ...

Terms are organized by molecular function, biological process,
and cellular component.

    MF - describes activities
    BP - one or more MF events (not the same as a pathway)
    CC - anatomical structure or product group, e.g.
         endoplasmic reticulum or proteasome, ribosome, protein dimer

http://www.geneontology.org/

Relationships among Protein (P), Function (F), and Location (L):

    P is_expressed_in L
    P is_up_regulated_in F
    P is_down_regulated_in F
    F affects L
    P is_a_target_for F
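
A minimal sketch of how the relationships listed above could be held
in the same kind of nested references used by this script. P1, F1
and L1 are placeholder names, and the hash layout
$relation->{$subject}{$predicate} is an assumption made for
illustration, not something defined by the Gene Ontology.

    use strict;
    use warnings;

    my $relation = {};

    # Each triple is [subject, predicate, object]; all names are placeholders
    my @triples = (
        [ 'P1', 'is_expressed_in',    'L1' ],
        [ 'P1', 'is_up_regulated_in', 'F1' ],
        [ 'F1', 'affects',            'L1' ],
    );

    # The inner hash and array are created implicitly by the push
    foreach my $triple (@triples) {
        my ($subject, $predicate, $object) = @$triple;
        push @{ $relation->{$subject}{$predicate} }, $object;
    }

    # Walk the structure in sorted order, one line per subject and predicate
    foreach my $subject (sort keys %$relation) {
        foreach my $predicate (sort keys %{ $relation->{$subject} }) {
            my @objects = sort @{ $relation->{$subject}{$predicate} };
            print "$subject $predicate ", join(', ', @objects), "\n";
        }
    }
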
=cut