SYNOPSIS
src2srcml [-hVnizcgv] [-l language] [-d directory] [-f filename]
[-s version] [-x encoding] [-t encoding]
[input-source-code-file]... [-o output-srcML-file]
DESCRIPTION
The program src2srcml translates source-code files into the XML
source-code representation srcML. The srcML format allows the use of XML
for addressing, querying, and transformation of source code. All text
from the original source-code file is preserved, including white space,
comments, and preprocessor statements. No preprocessing of the source
code is done. In addition, the tool can be applied to individual
source-code files or to code fragments.
The translation is fast and uses a stream-parsing approach with top-
down parsing and elements issued as soon as they are detected.
The program src2srcml is typically used with srcml2src, which converts
from the srcML format back to source code. Conversion of a source-code
file through src2srcml and then through srcml2src produces the original
source-code file. The program srcml2src also provides a set of
utilities for working with srcML files, including efficient querying
and transformation of source code.
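The round-trip property described above can be checked from a shell. The
following is a minimal sketch, not part of the distribution: the function
name roundtrip_check is hypothetical, and it assumes that srcml2src, like
src2srcml, reads standard input and writes standard output when no files
are named.

```shell
# Hypothetical helper: verify that src2srcml followed by srcml2src
# reproduces the original file byte for byte.
# Assumption: srcml2src reads stdin and writes stdout when given no files.
roundtrip_check() {
    src_file=$1
    # No output file is given, so src2srcml writes srcML to standard output.
    # cmp -s compares that stream ("-") with the original file silently.
    src2srcml "$src_file" | srcml2src | cmp -s - "$src_file"
}
```

For example, roundtrip_check main.cpp && echo "round trip OK" would report
success only when the regenerated text is identical to main.cpp.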
Standard input may be specified by using the character - in the place
of an input source-code file or providing no input source-code file.
Similarly, standard output may be specified with the - character for
the output srcML file or by not providing an output srcML file.
A source-code language must be specified when input is from standard
input.
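Because standard input carries no file extension, the language has to be
given explicitly. A small wrapper can enforce this before src2srcml is
run. This is only a sketch; the name translate_stdin is hypothetical.

```shell
# Hypothetical wrapper: translate a code fragment read from standard input.
# The language is passed explicitly because there is no file extension
# from which src2srcml could infer it.
translate_stdin() {
    lang=$1
    if [ -z "$lang" ]; then
        echo "translate_stdin: a language is required for standard input" >&2
        return 1
    fi
    # "-" names standard input; with no -o, srcML goes to standard output.
    src2srcml --language="$lang" -
}
```

A typical call would be: echo 'int n = 0;' | translate_stdin C++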
OPTIONS
-h, --help
Output the help and exit.
-V, --version
Output the version of src2srcml then exit.
-e, --expression
Translates a single, standalone expression.
-n, --archive
Stores all input source files into one srcML archive. Default
with more than one input file, a directory, or the --files-from
option.
--files-from
Treats the input file as a list of source files. Each file is
separately translated and collectively stored into a single sr-
cML archive. The list has a single filename on each line start-
the command iconv -l.
--no-xml-declaration
No output of the default XML declaration. Useful when the output
is to be placed inside another XML document.
--no-namespace-decl
No output of namespace declarations. Useful when the output is
to be placed inside another XML document.
-z, --compress
Output is in compressed gzip format. This format can be read
directly and automatically by srcml2src.
-c, --interactive
Default is to use buffered output for speed. For interactive
applications, output is issued as soon as it is parsed.
For input from a terminal, interactive is the default.
-g, --debug
When translation errors occur, src2srcml preserves all text but
may issue incorrect markup. In debug mode, the text with the
translation error is marked with a special set of tags with the
prefix err from the namespace https://sdml.info/srcML/srcerr.
Debug mode can also be indicated by defining a prefix for this
namespace URL, e.g.,
--xmlns:err="https://sdml.info/srcML/srcerr".
-v, --verbose
Outputs conversion and status information to stderr, including the
encodings used. Especially useful for monitoring the progress of the
--files-from option, a directory, or a source-code archive (e.g.,
tar.gz). The signal SIGUSR1 can be used to toggle this option.
METADATA OPTIONS
This set of options allows control over various metadata stored in the
srcML document.
-l, --language=language
The programming language of the source-code file. Allowable
values are C, C++, C#, Java, or AspectJ. The language affects
parsing, the allowed markup, and what is considered a keyword. The
value is also stored individually as an attribute in each unit
element.
If not specified, the programming language is based on the file
extension. If the file extension is not available or not in the
standard list, then the program will terminate.
--register-ext=extension=language
Sets the extensions to associate with a given language. Note:
-d, --directory=directory
The value of the directory attribute is typically obtained from
the path of the input filename. This option allows you to specify
a different directory for standard input or where the directory
is not contained in the input path.
-f, --filename=filename
The value of the filename attribute is typically obtained from
the input filename. This option allows you to specify a different
filename for standard input or where the filename is not
contained in the input path.
-s, --src-version=version
Sets the value of the attribute version to version. This is a
purely descriptive attribute; its value is not interpreted by
src2srcml.
MARKUP EXTENSIONS
Each extension to the srcML markup has its own namespace. An extension
is indicated in the srcML document by the declaration of its specific
namespace. The following flags make these declarations easier.
--literal
Additional markup of literal values using the element literal
with the prefix "lit" in the namespace
"https://sdml.info/srcML/literal".
Can also be specified by declaring a prefix for the literal
namespace using the --xmlns option, e.g.,
--xmlns:lit="https://sdml.info/srcML/literal"
--operator
Additional markup of operators using the element operator
with the prefix "op" in the namespace
"https://sdml.info/srcML/operator".
Can also be specified by declaring a prefix for the operator
namespace using the --xmlns option, e.g.,
--xmlns:op="https://sdml.info/srcML/operator"
--modifier
Additional markup of type modifiers using the element modifier
with the prefix "type" in the namespace
"https://sdml.info/srcML/modifier".
Can also be specified by declaring a prefix for the modifier
namespace using the --xmlns option, e.g.,
--xmlns:type="https://sdml.info/srcML/modifier"
LINE/COLUMN POSITION
Optional line and column attributes are used to indicate the position
There is a set of standard URIs for the elements in srcML, each with a
predefined prefix. The predefined URIs and prefixes for them include
(given in xmlns notation):
PREFIX  URI
(nil)   https://sdml.info/srcML/src
cpp     https://sdml.info/srcML/cpp
err     https://sdml.info/srcML/srcerr
lit     https://sdml.info/srcML/literal
op      https://sdml.info/srcML/operator
type    https://sdml.info/srcML/modifier
pos     https://sdml.info/srcML/position
The following options can be used to change the prefixes.
--xmlns=URI
Sets the URI for the default namespace.
--xmlns:PREFIX=URI
Sets the namespace prefix PREFIX for the namespace URI.
These options also provide an alternative way to turn on a markup
extension: declaring a prefix for the extension's namespace URI enables
it. See MARKUP EXTENSIONS for examples.
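The predefined prefixes and URIs listed above can be turned into --xmlns
options mechanically. The helper below is an illustrative sketch; the
name xmlns_flag is hypothetical, and the URIs are copied from the table
in this section.

```shell
# Hypothetical helper: print the --xmlns option that declares the
# predefined prefix for a namespace, using the table in this section.
xmlns_flag() {
    base="https://sdml.info/srcML"
    case $1 in
        cpp)  echo "--xmlns:cpp=$base/cpp" ;;
        err)  echo "--xmlns:err=$base/srcerr" ;;
        lit)  echo "--xmlns:lit=$base/literal" ;;
        op)   echo "--xmlns:op=$base/operator" ;;
        type) echo "--xmlns:type=$base/modifier" ;;
        pos)  echo "--xmlns:pos=$base/position" ;;
        *)    echo "xmlns_flag: unknown prefix: $1" >&2; return 1 ;;
    esac
}
```

For instance, src2srcml "$(xmlns_flag lit)" "$(xmlns_flag op)" main.cpp -o
main.cpp.xml should request the same markup as the --literal and
--operator flags.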
CPP MARKUP OPTIONS
This set of options allows control over how preprocessing regions are
handled, i.e., whether parsing and markup occur. In all cases the text
is preserved.
--cpp Turns on parsing and markup of preprocessor statements in non-
C/C++ languages such as Java. Can also be enabled by defining a
prefix for this cpp namespace URL, e.g.,
--xmlns:cpp="https://sdml.info/srcML/cpp".
--cpp-markup-else
Place markup in #else and #elif regions. Default.
--cpp-text-else
Only place text in #else and #elif regions leaving out markup.
--cpp-markup-if0
Place markup in #if 0 regions.
--cpp-text-if0
Only place text in #if 0 regions leaving out markup. Default.
files in srcML archives.
USAGE
To translate the C++ source-code file main.cpp into the srcML file
main.cpp.xml:
src2srcml main.cpp -o main.cpp.xml
To translate a C source-code file main.c into the srcML file
main.c.xml:
src2srcml --language=C main.c -o main.c.xml
To translate a Java source-code file main.java into the srcML file
main.java.xml:
src2srcml --language=Java main.java -o main.java.xml
To specify the directory, filename, and version for an input file from
standard input:
src2srcml --directory=src --filename=main.cpp --src-version=1 - -o main.cpp.xml
To translate a source-code file in ISO-8859-1 encoding into a srcML
file with UTF-8 encoding:
src2srcml --src-encoding=ISO-8859-1 --encoding=UTF-8 main.cpp -o main.cpp.xml
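To prepare a file list for the --files-from option, which expects a
single filename on each line, a helper such as the following could be
used. This is a sketch only: the name write_file_list and the choice of
*.cpp files are assumptions made for illustration.

```shell
# Hypothetical helper: write a file list for --files-from, one filename
# per line, covering every *.cpp file under a directory.
write_file_list() {
    dir=$1
    list=$2
    # sort gives a stable order, so the archive lists files predictably.
    find "$dir" -type f -name '*.cpp' | sort > "$list"
}
```

After write_file_list src files.txt, the list would be passed as the
input file, e.g., src2srcml --files-from files.txt -o project.xml,
assuming the invocation form implied by the option description.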
RETURN STATUS
0: Normal
1: Error
2: Problem with input file
3: Unknown option
4: Unknown encoding
6: Invalid language
7: Invalid combination of options
8: Incomplete output due to termination
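In scripts, the return status can be mapped back to the descriptions
above. The function below is a minimal sketch; the name describe_status
is hypothetical, and any value not listed here is reported as unlisted
rather than guessed.

```shell
# Hypothetical helper: map a src2srcml return status to the description
# listed in RETURN STATUS.
describe_status() {
    case $1 in
        0) echo "Normal" ;;
        1) echo "Error" ;;
        2) echo "Problem with input file" ;;
        3) echo "Unknown option" ;;
        4) echo "Unknown encoding" ;;
        6) echo "Invalid language" ;;
        7) echo "Invalid combination of options" ;;
        8) echo "Incomplete output due to termination" ;;
        *) echo "Unlisted status $1" ;;
    esac
}
```

It would typically be called right after a translation, for example
src2srcml main.cpp -o main.cpp.xml; describe_status $?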
CAVEATS
Translation is performed based on local information with no symbol
table. For languages that are not context-free, i.e., C/C++, and for
code that uses macros, this may lead to incorrect markup.
AUTHOR
Written by Michael L. Collard and Huzefa Kagdi
src2srcml 1.0 Wed May 21 10:36:09 EDT 2014 src2srcml(1)