FcaStone: software for FCA file format conversion and interoperability

Introduction

FcaStone (named in analogy to "Rosetta Stone") is a command-line utility that converts between the file formats of commonly used FCA tools (such as ToscanaJ, ConExp, Galicia and Colibri) and between FCA formats and other graph and graphics formats. The main purpose of FcaStone is to improve the interoperability between FCA, graph editing and vector graphics software.

FcaStone can

Cautions:

These features are not yet available, but are planned for future releases:

FcaStone and FcaFlint

The FcaFlint executable is bundled with FcaStone. FcaFlint implements Relation Algebra operations for formal contexts. Both programs can be installed and run independently. However, if FcaFlint is used with contexts in formats other than "cxt", it needs to be able to call FcaStone; in that case, follow the installation advice on the FcaFlint website.


Examples of lattices generated with the software

Examples of files to be used with FcaStone and of lattices generated using FcaStone and Graphviz's layouts can be found here.


Downloading the software

Please note that this program is distributed under the GNU General Public License v3.

FcaStone can be downloaded from GitHub (including the fcastone and fcaflint executable scripts, the licence and readme files).


Installation on Unix, Linux, Mac OS X

1) Note: FcaStone requires the Perl programming language to be installed. Perl is usually pre-installed in standard Linux and Mac OS X distributions. If Perl is not installed, it is available at http://www.perl.com/. For some of the lattice conversion features, Graphviz is required (see below).

2a) If you have administrator privileges, move the file to a shared bin directory (e.g. /usr/local/bin) and make it executable:

sudo mv fcastone /usr/local/bin
sudo chmod 755 /usr/local/bin/fcastone

2b) If you do not have administrator privileges, move the file to a bin directory that you have access to and that is in your path (creating a bin directory under your home directory if it does not already exist):

mkdir -p ~/bin
mv fcastone ~/bin
chmod 755 ~/bin/fcastone

If this bin directory is not in your path, then you need to add it to your PATH variable (see here for further information on changing the Unix PATH variable).
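
A minimal sketch of adding ~/bin to the PATH, assuming a Bourne-style shell (sh, bash, zsh). The helper name add_to_path is made up for this illustration. To make the change permanent, the same call would typically go into a shell startup file such as ~/.profile (which file is read depends on your shell).

```shell
# add_to_path is a hypothetical helper: it prepends a directory to PATH
# unless the directory is already listed there.
add_to_path() {
    case ":$PATH:" in
        *":$1:"*) ;;                 # already in PATH, nothing to do
        *) PATH="$1:$PATH" ;;
    esac
    export PATH
}

add_to_path "$HOME/bin"
```

Checking for an existing entry first keeps the PATH from growing each time the startup file is read.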

3) If you get a "command not found" error right after installation, run "rehash" (csh, tcsh, zsh) or "hash -r" (bash), or log out and log back in. Also, check the location of perl (for example by typing "which perl"). The fcastone script expects Perl to be installed in /usr/bin. If your Perl is installed elsewhere, edit the first line of the fcastone executable so that it contains the correct location of your perl installation.
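
Editing the first line can also be scripted. The following is only a sketch: fix_shebang is a hypothetical helper, not part of FcaStone, and it replaces the whole first line, so any options that followed the interpreter path on the original line are dropped.

```shell
# fix_shebang is a hypothetical helper: it rewrites the first line of a
# script so that it points at the given interpreter.
fix_shebang() {
    script=$1
    interpreter=$2
    tmp=$(mktemp) || return 1
    {
        printf '#!%s\n' "$interpreter"
        tail -n +2 "$script"          # everything except the old first line
    } > "$tmp" && cat "$tmp" > "$script"   # cat keeps the file's permissions
    status=$?
    rm -f "$tmp"
    return $status
}

# Typical use (assumes perl is on the PATH and fcastone is in ~/bin):
# fix_shebang ~/bin/fcastone "$(command -v perl)"
```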

Installation on Windows

1) Note: FcaStone requires the Perl programming language to be installed. It is not difficult to install and run this software on a PC, but Windows users may not be used to command-line programs.

Information about installing Perl can be found here. I have tested installing ActivePerl, which is freely available and can be downloaded as an msi file. The default installation sets the PATH variable so that it can find the Perl executable. This website contains a short tutorial on installing ActivePerl and on how to use the DOS prompt.

2) In order to avoid having to type long path names, it is probably easiest to move the fcastone file into a directory which also contains the FCA files that are to be used. The fcastone file can be changed to executable format, or it can be executed by preceding it with perl, i.e., "perl fcastone ...". It might be useful to rename fcastone to fcastone.pl, so that Windows knows the filetype.

Linux, Mac OS X and Windows: Installation of Graphviz

For some of the lattice conversions, FcaStone requires AT&T's Graphviz program "dot" to be installed. Installation on Mac OS X, Windows and several Linux flavours is easy because precompiled versions of Graphviz are available and can be downloaded from the Graphviz website. (It is advisable to install a recent version. Some package archives contain very old versions of Graphviz, which run much slower.)

The table below shows which features require Graphviz and which don't.

Note for use on a webserver: unfortunately, it may be impossible to install Graphviz on a webserver on a commercially hosted account (unless you can find a statically compiled version of the dot executable which can be copied to the webserver), because service providers don't usually give out administrator passwords. It should be possible, however, to install Graphviz on a virtual private server (because such accounts come with administrator privileges).

FcaStone passes the data straight to Graphviz. This means that the only output file formats available are those supported by the local Graphviz installation. If the labels in the lattice diagrams are not readable, then Graphviz is having trouble finding the local fonts (see Graphviz's documentation). If one of the formats does not work, try another (e.g. gif instead of jpg). The file fcaStoneDotErrors.log shows the errors and warnings produced by Graphviz. The name of this file can be changed, or the file can be suppressed, by editing the line that starts with "my $errorslocation = " in fcastone.

If Graphviz is installed in a location that is not in the shell's PATH, uncomment the line in fcastone that starts with "# $dotlocation = " and edit it to point to the location of dot.


Supported Formats

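extension  I/O           type                     scope              Graphviz required?  comments
---------  ------------  -----------------------  -----------------  ------------------  --------
cxt        input/output  FCA format               only context       no                  P. Burmeister's format
con        input/output  FCA format               only context       no                  Colibri format
slf        input/output  FCA format               only context       no                  Galicia format
bin.xml    input/output  FCA format               only context       no                  Galicia format
tuples     input/output  tab separated values     only context       no                  Tupleware format (like csv, but tab instead of comma, plus an additional first line); only two-column files supported
html       output only   html table               only context       no                  can be used with scripts
csv        input/output  comma separated values   only context       no                  used by databases/spreadsheets
csc        input/output  FCA format               context + lattice  no                  F. Vogt's Anaconda format (lattice not implemented)
cex        input/output  FCA format               context + lattice  no                  ConExp, lattice not yet implemented
csx        input/output  FCA format               context + lattice  no                  ToscanaJ, lattice not yet implemented
fig        output only   vector graphics          context + lattice  yes for lattice     xfig
tex        output only   latex                    context + lattice  yes for lattice     to be used with B. Ganter's fca.sty, lattice not yet implemented
dot        output only   graph format             only lattice       no                  Graphviz format
gml        output only   graph format             only lattice       no
gxl        output only   graph format             only lattice       yes                 format availability depends on local Graphviz installation
svg        output only   vector graphics          only lattice       yes                 format availability depends on local Graphviz installation
jpg        output only   raster graphics          only lattice       yes                 format availability depends on local Graphviz installation
gif        output only   raster graphics          only lattice       yes                 format availability depends on local Graphviz installation
png        output only   raster graphics          only lattice       yes                 format availability depends on local Graphviz installation
ps         output only   page description format  only lattice       yes                 format availability depends on local Graphviz installation
pdf        output only   page description format  only lattice       yes                 format availability depends on local Graphviz installation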

Other formats:

Many tools for image conversion exist elsewhere. Depending on the local Graphviz installation, more options may be available (e.g. running "dot -Tdia -o outputfilename inputFileInDotFormat" on Linux will produce output suitable for the Dia program). Programs such as ps2eps, fig2dev, ImageMagick's "convert", etc. provide further options for file conversions.


How to format the labels in the lattice diagrams

It is difficult to represent FCA lattices in non-FCA graph formats because concept lattices use up to two labels per node (objects and attributes) which is unusual compared to other graph applications. Since there is no graph format that is universally accepted as a standard by a variety of tools, it is furthermore difficult to decide which graph formats to convert to.

The default in FcaStone is to concatenate the labels belonging to the same node into one string, which is then cut off after 30 characters. If the -c option is used, the objects and attributes are "clarified", i.e., at most one object and one attribute per node are shown. The -t option places the labels of each node on top of each other. This is only useful if the output file is of type fig or svg and the file is afterwards edited in a vector graphics editor.

Two lattice designs are available (see some examples). With the -b option, each node is represented as a box: the objects are listed in the bottom half, the attributes in the top half. The advantage of this design is that the labels never overlap, because Graphviz adjusts the box sizes to the label sizes. The disadvantage is that this is not the standard FCA way of representing lattices. The other design (which is the default) is more similar to traditional FCA lattices, with the disadvantage that labels can overlap. The -t option only applies to this second design. These design choices are based on what can be stored in the dot format. The gml output of FcaStone just places the labels across the nodes, which is not satisfactory. Therefore, gml output should only be used if the lattice is afterwards manually edited.

In general, automatically generated lattices will probably never be as perfect as hand drawn ones. If perfect lattice pictures are desired, then traditional FCA tools should be used. FcaStone's facility for lattice generation is more aimed at applications in which the lattices cannot be hand drawn (such as automatically generated ones on webpages) and don't need to be perfect (for example, because they are just used to provide a rough sketch of what the lattice looks like).


Running the command/man page

NAME
fcastone
SYNOPSIS
fcastone [-bBcgijmnNOprstuUw] inputfile outputfile
fcastone -l
DESCRIPTION
The program determines the type of the format conversion by looking at the file extensions of the input and output files. If the files have incorrect extensions or are not correctly formatted with respect to how FCA files with such extensions are usually formatted, then the conversion will not work. The input and output files can be provided as path names, but the paths cannot contain any dots, because everything after the first dot is considered the extension of the file.
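
The following shell sketch illustrates the "first dot" rule described above. It is not FcaStone's own code, and fca_extension is a hypothetical helper written only for this illustration. The last call shows why a path containing a dot does not yield a usable extension.

```shell
# fca_extension is a hypothetical helper that mimics the documented rule:
# the extension is everything after the first dot in the argument.
fca_extension() {
    case $1 in
        *.*) printf '%s\n' "${1#*.}" ;;
        *)   return 1 ;;             # no dot, so no extension
    esac
}

fca_extension waters.cxt         # prints: cxt
fca_extension lattice.bin.xml    # prints: bin.xml
fca_extension ./data/waters.cxt  # prints: /data/waters.cxt (not a known format)
```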

At the moment only files which contain one (binary) formal context and/or one lattice diagram can be processed.

The following options are available:
-b Box format: the nodes in the lattice are displayed as boxes, which contain the objects and attributes.
-B another Box format: the nodes in the lattice are displayed as boxes, which contain the objects and attributes. Attributes are separated by commas and the list of attributes is truncated if it is too long. Objects are separated by space and not truncated. This format is used by the -j option (where the objects are image files).
-c Clarify: the lattice diagram is clarified, which means that if more than one object or attribute belong to the same node, only one is printed.
-g Graphviz: the lattice layout will be calculated with Graphviz (which means that Graphviz must be installed). It is not necessary to use -g with svg, jpg, gif, png, ps and pdf files because these imply using Graphviz. For cex, csx, fig and tex files, using -g ensures that the lattice, not the context, is generated (although this is only implemented for fig at the moment). For dot files, the -g option results in the coordinates being included in the file.
-i Imagemap: adds clickable links of the format "script?attr=...+obj=..." to the nodes. This is only useful on a webserver with svg files or html imagemaps. The script that responds to the request needs to be installed in the same directory and must have the same name as the output file without extension. For example, if a script calls "fcastone -w -i format.cxt answer.svg", then the responding script must be called "answer". Because of the use of "-w", no output file "answer.svg" is produced (see the examples below). The "-i" option implies "-b", unless used with "-B".
-j image files, such as Jpg, png, gif, etc., are displayed instead of the names of objects. It is advisable to use thumbnail-sized images. In svg output, clicking on a thumbnail will open the image in full size. The names of the image files should be provided as "path_to_thumbnail|path_to_fullimage" or as "path_to_thumbnail" (if there is no full-sized image). Svg output files link to the images but do not include them; therefore, the paths to the files need to be correct both while FcaStone is running and while the svg files are viewed. For non-svg output formats it may be necessary to install extra Graphviz plugins. The "-j" option implies "-B".
-l Licence: running "fcastone -l" without any other arguments prints the licence and warranty disclaimer.
-m Microsoft: prints the output file with Microsoft style line breaks. This is only relevant for non-XML text files.
-n No input file: reads from STDIN instead of a file. The command-line should contain something like "format.cxt" instead of the name of the input file to indicate the type of the input data. The "-n" option implies "-s".
-N No output file: writes to STDOUT instead of a file. The command-line should contain something like "format.cxt" instead of the name of the output file to indicate the type of the output data. The "-N" option implies "-s".
-O One line input/output: this option is only needed if FcaStone is called from a program using bidirectional communication which can only send/receive one line at a time. The internal linebreaks of non-XML files should be replaced with the character sequence '|%|%' for this purpose. For XML files, linebreaks should simply be removed. The "-O" option implies "-s".
-p Pipe: only relevant for csv files. Use a pipe (|) instead of a comma as delimiter. This option should be used if the data contains commas. (Note: tab delimited files are provided by the .tuples extension.)
-r Rotating: only relevant for tex files. Ensures that the attributes are rotated sideways in the formal context.
-s Silent mode: suppresses warnings about output files being overwritten if files with such names already exist.
-t on Top: labels belonging to the same node are printed on top of each other. This is only useful for fig and svg files which are edited manually in a vector graphics editor.
-u Unicode: tells FcaStone explicitly that the incoming data is in UTF-8 format. In most cases, this option will not make a difference. This option should not be used together with "-U".
-U Unicode: converts incoming UTF-8 data into numeric character references. This option should be used if non-ASCII characters do not display properly and the output file is an XML file (html, svg, cex, csx, ...). The input file should be in UTF-8 encoding.
-w Web mode: this mode reads the input from STDIN, writes the output to STDOUT and replaces "\n" with <br>. In this manner, fcastone can be called from a server-side script that generates a graphics file of a concept lattice to be displayed on a webpage. See the examples below. The "-w" option implies "-snN".
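
For the "-O" option, the line-break replacement for non-XML data can be sketched in shell as follows. The helpers to_one_line and from_one_line are hypothetical and not part of FcaStone. Whether FcaStone also expects the separator after the last line is not stated here, so this sketch only places it between lines. The sample input is a tiny one-object, one-attribute context in cxt format.

```shell
# to_one_line / from_one_line are hypothetical helpers for the "-O" option.
# Both read STDIN and write STDOUT.
to_one_line() {
    # join all input lines with the separator '|%|%'
    awk '{ printf "%s%s", (NR > 1 ? "|%|%" : ""), $0 } END { print "" }'
}

from_one_line() {
    # split the single line back into the original lines
    awk '{ n = split($0, part, "[|]%[|]%"); for (i = 1; i <= n; i++) print part[i] }'
}

printf 'B\n\n1\n1\n\nobject1\nattr1\nX\n' | to_one_line
# prints: B|%|%|%|%1|%|%1|%|%|%|%object1|%|%attr1|%|%X
```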

EXAMPLES
fcastone waters.cxt waters.cex
fcastone -g waters.cxt waters.fig

Use on a website (Perl, explanation below):

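#!/usr/local/bin/perl
use strict;
use warnings;

$| = 1;  # flush the header before fcastone writes to STDOUT
print "Content-type: image/gif\n\n";
# context in csv format, one "object,attribute" pair per line
my $fcacontent = "object1,attr1\n object1,attr2\n";
# fcastone reads the context from the pipe and writes the image to STDOUT
open(my $pipe, '|-', 'fcastone -w -b format.csv format.gif')
    or die "Cannot run fcastone: $!";
print $pipe $fcacontent;
close $pipe;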

Use on a website (PHP, explanation below):

<?php
header("Content-type: text/html");
$fcacontent = "object1,attr1\n object1,attr2\n";
// 0 = fcastone's STDIN (we write to it), 1 = fcastone's STDOUT (we read from it)
$descriptorspec = array(0 => array("pipe", "r"), 1 => array("pipe", "w"));
$process = proc_open('/usr/local/bin/fcastone -w -b format.csv format.cxt', $descriptorspec, $pipes);

if (is_resource($process)) {
    fwrite($pipes[0], $fcacontent);
    fclose($pipes[0]);
    $retval = "";
    while (!feof($pipes[1])) { $retval .= fgets($pipes[1]); }
    echo $retval;
    fclose($pipes[1]);
    $return_value = proc_close($process);
}
?>

In these examples, format.csv, format.gif and format.cxt are just strings that indicate which formats are used for input and output; they are not filenames. The variable $fcacontent contains a context in csv format. SECURITY WARNING: any form variables that are piped into a program must be carefully checked for insecure data, because otherwise an attacker might be able to execute arbitrary commands on the server. For further explanations of this risk, do a web search on "user submitted data" "security".

Also, see the troubleshooting comments below.
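
One common way to check user-submitted data is to accept only a whitelist of characters. The shell sketch below illustrates the idea; is_safe_context_data is a hypothetical helper, and the permitted character set is an assumption that has to be adapted to the data in question. The same approach carries over to Perl or PHP with a regular expression. It reduces the risk but does not replace a proper security review.

```shell
# is_safe_context_data is a hypothetical helper. It succeeds only if its
# argument is non-empty and consists of letters, digits, spaces, commas,
# underscores, dots, hyphens and newlines (an assumed whitelist).
is_safe_context_data() {
    [ -n "$1" ] || return 1
    if printf '%s\n' "$1" | LC_ALL=C grep -q '[^A-Za-z0-9 ,_.-]'; then
        return 1                     # found a character outside the whitelist
    fi
    return 0
}

data='object1,attr1
object1,attr2'
if is_safe_context_data "$data"; then
    printf '%s\n' "$data"            # here the data would be piped to fcastone -w
else
    echo "rejected" >&2
fi
```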

FILES
fcaStoneDotErrors.log
(This file is created in the current directory and contains any errors or warnings produced by running Graphviz's dot.)
BUGS
This version is an alpha release of the software, and I am sure it still contains many bugs. If you find a bug, please send me an email (see http://www.upriss.org.uk for my current email address). Please include the following in your email: 1) the command-line that you used, 2) the name of the operating system of your computer, 3) a short description of the problem, 4) the input file as an attachment.
AUTHOR
This program was written by Uta Priss (http://www.upriss.org.uk). More information on FCA can be found at http://www.upriss.org.uk/fca/.

The algorithm for calculating the concepts is an implementation of Bernhard Ganter's algorithm described in "Two basic algorithms in concept analysis." Technische Hochschule Darmstadt, FB4-Preprint, 831, 1984. (It's slow because it's implemented using strings - the algorithm itself is fine.)

More details on using FcaStone with other tools

Text formatting

  • Latex:
    Latex is a document markup and preparation system. The FCA latex output is to be used with B. Ganter's fca.sty. Only contexts (not lattices) can currently be generated.

Graph layout and editors

  • Graphviz:
    Graphviz is an open source (CPL licence) graph layout program with dot as its native format. It provides a variety of file conversion options. FcaStone calls it in order to produce the lattice layouts. It comes with the XWindows program "dotty", which can be used to edit the lattices (stored in dot format). A list of other tools that can be used with dot files is available on the Graphviz website.
  • yEd:
    yEd is a closed source, proprietary, but free to download, Java-based graph editor with GraphML as its native format. It is easy to install on Windows (exe file), Mac OS (dmg file) and Unix/Linux (sh file). It can import gml files. yEd has its own graph layout functionality. FcaStone can produce gml files without Graphviz being installed (but see the comment about the labels in gml files).
  • jGraph:
    JGraph is an open source, free to download, Java-based graph editor. Presumably it can read gxl files, but I have not tested this.

Vector graphics editors

  • Xfig/WinFig/jfig:
    Xfig is a Unix (Linux, Mac OS X) vector graphics editor with fig as its native format. For installation: there is a tar.gz file on the website. On Mac OS, a binary can be installed via Fink. On Linux, it should be available through the usual package management systems. WinFig is a version that runs on PCs; jfig is platform independent.
    Without the "-g" option, FcaStone produces a fig file of the context. With the "-g" option, the lattice is produced.
  • Inkscape:
    Inkscape is an open source vector graphics editor with svg as its native format. It can be downloaded from sourceforge as binary versions for Linux (package), Mac OS X (dmg), and Windows (exe). Lattices in svg format can be opened in Inkscape. Inkscape has a connector facility which would make it possible to edit graphs so that moving a node also moves the connected edges. Unfortunately, the connections are stored using special Inkscape-only xml tags, which are not present in the svg files that are generated by Graphviz.
  • Dia:
    Dia is a GTK+ based diagram creation program for Linux, Unix and Windows released under the GPL license. It is pre-installed on many Linux distributions. A Windows executable is available. On other Unix systems and Mac OS X, it has to be installed from source. Graphviz can convert dot files into dia files if the required libraries are installed (using "dot -Tdia -o outputfilename inputFileInDotFormat").


Troubleshooting/FAQ


Copyright 2009 Uta Priss.
www.upriss.org.uk