convert and mogrify achieve similar goals. convert performs some operation on a file (anything from changing its format to something more complicated) and writes the result to a new file. mogrify modifies a file in place and would not normally be used to convert formats.
The two have similar signatures:
seqmagick convert [options] infile outfile
vs:
seqmagick mogrify [options] infile
Options are shared between convert and mogrify.
convert can be used to convert between any of the many file types BioPython supports. For a full list of supported types, see the BioPython SeqIO wiki page.
By default, file type is inferred from file extension, so:
seqmagick convert a.fasta a.sto
converts an existing file a.fasta from FASTA to Stockholm format. Neat! But there’s more.
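The extension-based inference described above can be pictured as a simple lookup. The following is a minimal sketch, not seqmagick's actual code: the table EXTENSION_TO_FORMAT and the helper infer_format are hypothetical names, and the mapping is deliberately incomplete. The values are format names that BioPython's SeqIO accepts.

```python
import os

# Hypothetical, deliberately incomplete mapping from file extension to
# BioPython SeqIO format names. seqmagick's own table may differ.
EXTENSION_TO_FORMAT = {
    ".fasta": "fasta",
    ".fa": "fasta",
    ".sto": "stockholm",
    ".phy": "phylip",
    ".nex": "nexus",
}


def infer_format(path):
    """Return the format name implied by the extension of path."""
    extension = os.path.splitext(path)[1].lower()
    try:
        return EXTENSION_TO_FORMAT[extension]
    except KeyError:
        raise ValueError("Unknown extension: %r" % extension)


# The two files from the example command above.
print(infer_format("a.fasta"), "->", infer_format("a.sto"))
```

A lookup like this fails loudly on an unrecognized extension rather than guessing, which is one reasonable design choice for a tool that writes files.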
A wealth of options await you when you’re ready to do something slightly more complicated with your sequences.
Let’s say I just want a few of my sequences:
$ seqmagick convert --head 5 examples/test.fasta examples/test.head.fasta
$ seqmagick info examples/test*.fasta
name alignment min_len max_len avg_len num_seqs
examples/test.fasta FALSE 972 9719 1573.67 15
examples/test.head.fasta FALSE 978 990 984.00 5
Or I want to remove any gaps, reverse complement, select the last 5 sequences, and remove any duplicates from an alignment in place:
seqmagick mogrify --tail 5 --reverse-complement --ungap --deduplicate-sequences examples/test.fasta
You can even define your own functions in python and use them via --apply-function.
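As a rough illustration of what such a function might look like, here is a sketch that assumes the function receives an iterable of BioPython SeqRecord objects, yields the records to keep, and gets any extra parameter as a string. The file name my_filters.py, the function keep_prefix, and the Record stand-in are all hypothetical. The stand-in only lets the example run without BioPython installed.

```python
# my_filters.py (hypothetical file name)
from collections import namedtuple


def keep_prefix(records, prefix="sample"):
    """Yield only the records whose id starts with prefix.

    Assumption: records is an iterable of objects with an `id` attribute
    (such as Bio.SeqRecord), and prefix arrives as a string.
    """
    for record in records:
        if record.id.startswith(prefix):
            yield record


# Stand-in for Bio.SeqRecord so the demo has no external dependencies.
Record = namedtuple("Record", ["id", "seq"])

demo = [
    Record("sampleA", "ACGT"),
    Record("ctrl1", "GGCC"),
    Record("sampleB", "TTAA"),
]
print([r.id for r in keep_prefix(demo, "sample")])
```

Writing the function as a generator keeps it lazy, so it can sit in a chain of transformations without loading every record into memory. Check seqmagick convert --help for the exact argument syntax of --apply-function before relying on this shape.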
Note
To maximize flexibility, most transformations passed as options to mogrify and convert are processed in order, so:
seqmagick convert --min-length 50 --cut 1:5 a.fasta b.fasta
will work fine, but:
seqmagick convert --cut 1:5 --min-length 50 a.fasta b.fasta
will never return records, since the cut is applied first: every sequence is trimmed to 5 positions before the minimum length predicate runs, so none can reach the required length of 50.
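The effect of ordering can be reproduced with a toy pipeline. This is a sketch of the general idea only, not seqmagick's implementation: records are plain (name, sequence) tuples, and cut, min_length, and run are hypothetical helpers. The cut here is assumed to be 1-indexed and inclusive, so 1:5 keeps five positions.

```python
def cut(records, start, end):
    """Keep only the residues in the 1-indexed, inclusive range start..end."""
    for name, seq in records:
        yield name, seq[start - 1:end]


def min_length(records, length):
    """Drop records whose sequence is shorter than length."""
    for name, seq in records:
        if len(seq) >= length:
            yield name, seq


def run(records, steps):
    """Apply each (function, args) step in the order given."""
    for step, args in steps:
        records = step(records, *args)
    return list(records)


records = [("short", "A" * 30), ("long", "ACGT" * 20)]

# Filter first, then cut: the 80-residue record survives and is trimmed.
filter_then_cut = run(records, [(min_length, (50,)), (cut, (1, 5))])

# Cut first, then filter: everything is 5 residues long, so nothing passes.
cut_then_filter = run(records, [(cut, (1, 5)), (min_length, (50,))])

print(filter_then_cut)  # [('long', 'ACGTA')]
print(cut_then_filter)  # []
```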