CoCalc install of ccg2lambda Compositional Semantics System

Example install of ccg2lambda on CoCalc

This worksheet essentially follows the install instructions from the ccg2lambda GitHub repo.

One quick way to use this worksheet is to create a directory CCG2 in your home directory in a CoCalc project and upload this file into that directory:

  1. In the Files view of your project, make sure you are in your home directory by clicking the "Home" icon.
  2. At upper right in the blank for "terminal command", type mkdir ~/CCG2. If the directory already exists, you will get a harmless error message.
  3. Open this file in your browser with this link (you may already be viewing it this way): https://cocalc.com/projects/cd3c25e4-5fbd-439b-9604-6011584af918/files/CCG2/?session=share
  4. Click the Open in CoCalc link at upper right in the shared worksheet tab.
  5. Click Files at upper left in the tab that opens.
  6. Click the checkbox next to the name of this file. A row of buttons will appear above the list of files.
  7. Click Copy, then choose a different project for the destination. Select your project under "In the project" and select CCG2 under "Destination".
  8. Click Copy 1 item.

There is a screenshot file, premise0.png, that is also shared, although it is not essential. You can also copy that using the last 3 steps above.
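The directory creation in step 2 can also be made repeatable, so that it prints no error when the directory already exists. A minimal sketch, assuming a POSIX shell; the `ensure_dir` helper name is hypothetical and not part of the worksheet, and the demo uses a scratch directory rather than your real home directory:

```shell
# ensure_dir is a hypothetical helper: mkdir -p succeeds silently
# when the directory already exists.
ensure_dir() {
  mkdir -p "$1"
}

# Demo in a scratch location so nothing in the real home directory is touched.
scratch=$(mktemp -d)
ensure_dir "$scratch/CCG2"
ensure_dir "$scratch/CCG2"   # second call is a no-op and reports no error
ls -d "$scratch/CCG2"
```

In the project itself, the equivalent command would be `mkdir -p ~/CCG2`.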

From the README file of ccg2lambda:

This is a tool to derive formal semantic representations of natural language sentences given CCG derivation trees and semantic templates.

NOTE: The MathJax CDN URL issue mentioned below has been fixed upstream, so the text substitution is commented out.

%auto
%default_mode sh
CC_DIR=~/CCG2
pwd
/home/user/CCG2/ccg2lambda
# clone ccg2lambda git repo
cd $CC_DIR
ccrepo_url="https://github.com/mynlp/ccg2lambda.git"
ccrepo=`basename $ccrepo_url .git`
test -d $ccrepo || {
  echo cloning $ccrepo from GitHub
  git clone -q $ccrepo_url
}

ls -ld $ccrepo
echo running basic tests - expect a few failures
(cd $ccrepo
python3 scripts/run_tests.py 2>&1 | grep -E "^(Ran|FAILED)"
echo compile the coq library that contains the axioms
rm -f coqlib.glob coqlib.vo
/usr/bin/coqc coqlib.v
ls -l coq*
)
drwxr-xr-x 7 user user 22 May 14 06:58 ccg2lambda
running basic tests - expect a few failures
Ran 162 tests in 2.271s
compile the coq library that contains the axioms
Identifier 'most' now a keyword
-rw-r--r-- 1 user user  8706 May 14 07:01 coqlib.glob
-rw-r--r-- 1 user user  6478 May 11 11:49 coqlib.v
-rw-r--r-- 1 user user 20206 May 14 07:01 coqlib.vo
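The cell above always deletes and recompiles `coqlib.vo`. If you would rather recompile only when needed, a freshness check along the following lines could be used. This is a sketch, not part of the worksheet: `vo_is_fresh` is a hypothetical helper, and the `-nt` comparison is assumed to be available in your shell's `test` (it is in bash and dash).

```shell
# vo_is_fresh is a hypothetical helper: succeeds when the compiled .vo file
# exists and its .v source is not newer than it.
vo_is_fresh() {
  src=$1
  vo=${src%.v}.vo
  [ -f "$vo" ] && [ ! "$src" -nt "$vo" ]
}

# Run from the ccg2lambda directory, where coqlib.v lives.
if vo_is_fresh coqlib.v; then
  echo "coqlib.vo is up to date"
else
  echo "coqlib.vo is missing or stale; rerun coqc coqlib.v"
fi
```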
# download C & C parser and models
parser_url="http://www.cl.cam.ac.uk/~sc609/resources/candc-downloads/candc-linux-1.00.tgz"
parser=`basename $parser_url`
test -f $parser || {
  echo downloading parser $parser
  curl -sO $parser_url
}

models_url="http://www.cl.cam.ac.uk/~sc609/resources/candc-downloads/models-1.02.tgz"
models=`basename $models_url`
test -f $models || {
  echo downloading models $models
  curl -sO $models_url
}

ls -l $parser $models


-rw-r--r-- 1 user user 12095247 May 11 11:39 candc-linux-1.00.tgz
-rw-r--r-- 1 user user 52304957 May 11 11:41 models-1.02.tgz
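Because `curl -s` suppresses error messages and the cell skips the download whenever the file already exists, a failed transfer can leave behind a truncated file or a saved error page that is never replaced. One way to guard against that is to check that each file really is a gzip-compressed tar archive before extracting it. A minimal sketch; `is_valid_tgz` is a hypothetical helper name, not something the worksheet defines:

```shell
# is_valid_tgz is a hypothetical helper: succeeds only if the file can be
# listed as a gzip-compressed tar archive.
is_valid_tgz() {
  tar -tzf "$1" > /dev/null 2>&1
}

# Run from the directory holding the downloads.
for f in candc-linux-1.00.tgz models-1.02.tgz; do
  if is_valid_tgz "$f"; then
    echo "$f looks OK"
  else
    echo "$f is missing or damaged; delete it and download again"
  fi
done
```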
# extract parser and models
cd $CC_DIR
pwd
parserb=`basename $parser .tgz`
tar -xzvf $parser > p.out
head -1 p.out
# candc-1.00 without -linux infix
CAC_TOP=`head -1 p.out| sed -e 's|/.*||'`
echo CAC_TOP $CAC_TOP
CAC_DIR=$CC_DIR/$CAC_TOP
echo CAC_DIR $CAC_DIR
tar -C $CAC_DIR -xzf $models
ls $CAC_DIR

/home/user/CCG2
candc-1.00/bin/
CAC_TOP candc-1.00
CAC_DIR /home/user/CCG2/candc-1.00
bin  models
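The `sed -e 's|/.*||'` expression in the cell above is what turns the first line of the tar listing into the top-level directory name: it deletes everything from the first slash onward. The sketch below isolates that step; `top_dir` is a hypothetical wrapper used only for illustration.

```shell
# top_dir is a hypothetical wrapper around the same sed expression:
# keep only the text before the first slash.
top_dir() {
  printf '%s\n' "$1" | sed -e 's|/.*||'
}

top_dir "candc-1.00/bin/"   # prints candc-1.00
```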
# create sentences file
sfile=$ccrepo/sentences.txt
test -f $sfile || {
  echo creating file $sfile
  cat > $sfile <<_EOF
All women ordered coffee or tea.
Some woman did not order coffee.
Some woman ordered tea.
_EOF
}

ls -l $sfile
-rw-r--r-- 1 user user 90 May 11 11:59 ccg2lambda/sentences.txt
# tokenize
cd $CC_DIR/$ccrepo
sfileb=`basename $sfile`
tfile="sentences.tok"
sed -f en/tokenizer.sed > $tfile < $sfileb
cat $tfile
All women ordered coffee or tea .
Some woman did not order coffee .
Some woman ordered tea .
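For these three sentences, the visible effect of tokenization is that the sentence-final period becomes a separate token. The sketch below reproduces only that one effect with a single sed rule. It is a simplified stand-in for illustration, not the contents of `en/tokenizer.sed`, and `split_final_punct` is a hypothetical name.

```shell
# split_final_punct is a hypothetical, simplified stand-in for a tokenizer:
# it inserts a space before a sentence-final ., ! or ? and does nothing else.
split_final_punct() {
  sed -e 's/\([.!?]\)$/ \1/'
}

printf '%s\n' "Some woman ordered tea." | split_final_punct
# prints: Some woman ordered tea .
```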
# create xml and convert C&C xml to Jigg xml
cd $CC_DIR/$ccrepo
$CAC_DIR/bin/candc --models $CAC_DIR/models --candc-printer xml --input sentences.tok > sentences.candc.xml
python3 en/candc2transccg.py sentences.candc.xml > sentences.xml
# this file was generated by the following command(s):
# /home/user/CCG2/candc-1.00/bin/candc --models /home/user/CCG2/candc-1.00/models --candc-printer xml --input sentences.tok
1 parsed at B=0.075, K=20
1 coverage 100%
1 stats 4.54329 232 296
2 parsed at B=0.075, K=20
2 coverage 100%
2 stats 3.61092 99 119
3 parsed at B=0.075, K=20
3 coverage 100%
3 stats 3.09104 93 110
Produced 3 transccg trees
# obtain the semantic representations
cd $CC_DIR/$ccrepo
python3 scripts/semparse.py sentences.xml en/semantic_templates_en_emnlp2015.yaml sentences.sem.xml
head -8 sentences.sem.xml
<?xml version='1.0' encoding='utf-8'?>
<root>
  <document>
    <sentences>
      <sentence>
        <tokens>
          <token start="0" span="1" pos="DT" chunk="I-NP" entity="O" cat="NP[nb]/N" id="t0_0" surf="All" base="all"/>
          <token start="1" span="1" pos="NNS" chunk="I-NP" entity="O" cat="N" id="t0_1" surf="women" base="woman"/>
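Each `<token>` element in this output carries the surface form in its `surf` attribute, alongside the part of speech (`pos`) and the CCG category (`cat`). As a quick sanity check that the tokens match the input sentences, the surface forms can be listed with standard text tools. This is a rough sketch that treats the XML as plain text rather than parsing it properly; `surface_forms` is a hypothetical helper, and `grep -o` is assumed to be available (it is in GNU and BSD grep).

```shell
# surface_forms is a hypothetical helper: print the value of every
# surf="..." attribute in the given file, one per line.
surface_forms() {
  grep -o 'surf="[^"]*"' "$1" | cut -d'"' -f2
}

# Run from the ccg2lambda directory after semparse.py has produced the file.
if [ -f sentences.sem.xml ]; then
  surface_forms sentences.sem.xml | head -3
fi
```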
# build a theorem and prove it
cd $CC_DIR/$ccrepo
python3 scripts/prove.py sentences.sem.xml --graph_out graphdebug.html
# fix mathjax config so google chrome will show mathml output
#x="cdn.mathjax.org/mathjax/latest"
#y="cdnjs.cloudflare.com/ajax/libs/mathjax/2.7.1"
#sed -i -e "s|$x|$y|" -e "s|http|https|" graphdebug.html
grep Jax graphdebug.html
yes
              src="https://cdnjs.cloudflare.com/ajax/libs/mathjax/2.7.1/MathJax.js?config=TeX-AMS-MML_HTMLorMML">
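The `yes` printed by `prove.py` above is the verdict of the proof attempt. As far as I understand the ccg2lambda README, the last sentence in the file is treated as the conclusion and the preceding sentences as premises, and the script prints `yes`, `no`, or `unknown`. The sketch below maps each value to a short description under that assumption; `explain_result` is a hypothetical helper and the wording of the descriptions is mine, not the tool's.

```shell
# explain_result is a hypothetical helper. The three expected values are an
# assumption based on the ccg2lambda README; anything else is reported as-is.
explain_result() {
  case "$1" in
    yes)     echo "entailment: the conclusion was proved from the premises" ;;
    no)      echo "contradiction: the negated conclusion was proved" ;;
    unknown) echo "unknown: neither could be proved" ;;
    *)       echo "unexpected output: $1" ;;
  esac
}

explain_result yes
```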
%sage
# you may have to right-click and open the link in a new window to see the diagram
url = "https://cocalc.com/"+salvus.project_info()['project_id']+"/raw/CCG2/ccg2lambda/graphdebug.html"
my_html = '<a href="{}" target="_blank">graphical representation of CCG trees</a>'.format(url)
salvus.html(my_html)
graphical representation of CCG trees
%sage
# here is a screenshot of the start of the diagram above
salvus.file("premise0.png")