Max Planck Institute for the History of Science

[This software is dedicated to Dr. Malcolm Hyman]

[It is based on Donatus and Pollux]

Max Planck Institute for the History of Science - Language technology services

Url: /mpiwg-mpdl-cms-web/lt/GetDictionaryEntries
- Request parameters
  - query (required)
    - by one form or lemma (e.g. "revolution")
    - by a list of forms or lemmas (e.g. "revolution equality brotherliness")
    - by a prefix range: entries starting with a prefix (e.g. "a*")
  - queryDisplay (optional)
    - display of the query
    - default: content of parameter "query"
  - inputType (optional)
    - "form"
    - "lemma"
    - default: "form"
  - language (optional)
    - ISO 639-3 specifier
    - default: "eng"
  - dictionary (optional)
    - dictionary name, e.g. "webster"
    - default: "all" (all dictionaries for the specified language)
  - outputType (optional)
    - this parameter may be repeated (e.g. "outputType=morphCompact&outputType=dictCompact")
      - "morphCompact"
      - "dictCompact"
      - "wikiCompact"
      - "allCompact" (all output types compact)
      - "morphFull"
      - "dictFull"
      - "wikiFull"
      - "allFull" (all output types full)
    - default: "allCompact"
  - outputFormat (optional)
    - "html"
    - "xml"
    - default: "xml"
  - normalization (optional)
    - "none"
    - "norm"
    - default: "norm"
  - resultPageNumber (optional)
    - applies only to prefix range queries
    - page number of the result (e.g. "2": result entries from position 51 to 100)
    - default: "1"
  - resultPageSize (optional)
    - applies only to prefix range queries
    - page size of the result (e.g. "100": each result page has a size of 100)
    - default: "50"
- Response output
  - depends on outputFormat, outputType and the requested result page: morphology, dictionary and Wikipedia entries in XML or HTML format
  - Example: query=a*&language=lat&outputFormat=html
  - Example: query=a*&dictionary=ls
  - Example: query=revolution&language=lat
  - Example: query=multa&language=lat&outputFormat=html&outputType=allCompact
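
The request parameters above can be combined into a URL as in the following minimal sketch. The host in BASE_URL and the helper names dictionary_entries_url and page_range are assumptions made for illustration; this document gives only the service path. The sketch repeats "outputType" once per value and derives the entry positions covered by a result page from resultPageNumber and resultPageSize.

```python
from urllib.parse import urlencode

# Hypothetical host: this document gives only the path of each service.
BASE_URL = "http://localhost:8080"
PATH = "/mpiwg-mpdl-cms-web/lt/GetDictionaryEntries"


def dictionary_entries_url(query, output_types=(), **params):
    """Build a GetDictionaryEntries request URL.

    output_types becomes one "outputType" pair per value, since the
    parameter may be repeated. Other parameters are passed by name.
    """
    pairs = [("query", query)]
    pairs += sorted(params.items())
    pairs += [("outputType", t) for t in output_types]
    # urlencode percent-encodes reserved characters such as "*".
    return BASE_URL + PATH + "?" + urlencode(pairs)


def page_range(page_number=1, page_size=50):
    """Return the first and last entry position covered by a result page."""
    first = (page_number - 1) * page_size + 1
    last = page_number * page_size
    return first, last
```

For instance, page_range(2) covers positions 51 to 100, which matches the resultPageNumber example above.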
Url: /mpiwg-mpdl-cms-web/lt/GetLemmas
- Request parameters
  - query (required)
    - one form or lemma (e.g. "revolution") or
    - blank-separated list of forms or lemmas (e.g. "revolution equality brotherliness")
  - inputType (optional)
    - "form"
    - "lemma"
    - default: "form"
  - language (optional)
    - ISO 639-3 specifier
    - default: "eng"
  - outputType (optional)
    - "compact"
    - "full"
    - default: "compact"
  - outputFormat (optional)
    - "html"
    - "xml"
    - "string" (lemma names separated by a blank)
    - default: "xml"
  - normalization (optional)
    - "none"
    - "norm"
    - default: "norm"
- Response output
  - depends on outputFormat and outputType: lemma entries in XML, HTML or string format
  - Example: query=multa&language=lat&outputFormat=html
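
A short sketch of a GetLemmas request that asks for the "string" output and splits the result into lemma names. BASE_URL and the function names are hypothetical; only the path and the parameter names come from this document.

```python
from urllib.parse import urlencode

# Hypothetical host: this document gives only the path of each service.
BASE_URL = "http://localhost:8080"
PATH = "/mpiwg-mpdl-cms-web/lt/GetLemmas"


def lemmas_url(words, language="eng", input_type="form"):
    """Build a GetLemmas URL that asks for the plain "string" output."""
    params = {
        "query": " ".join(words),  # blank-separated list of forms or lemmas
        "inputType": input_type,
        "language": language,
        "outputFormat": "string",
    }
    # urlencode writes each blank in the query as "+".
    return BASE_URL + PATH + "?" + urlencode(params)


def parse_lemma_string(body):
    """Split a "string" response into lemma names (separated by blanks)."""
    return body.split()
```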
Url: /mpiwg-mpdl-cms-web/lt/GetForms
- Request parameters
  - query (required)
    - one lemma (e.g. "revolution") or
    - blank-separated list of lemmas (e.g. "revolution equality brotherliness")
  - language (optional)
    - ISO 639-3 specifier
    - default: "eng"
  - outputType (optional)
    - "compact"
    - "full"
    - default: "compact"
  - outputFormat (optional)
    - "html"
    - "xml"
    - "string" (form names separated by a blank)
    - default: "xml"
  - normalization (optional)
    - "none"
    - "norm"
    - default: "norm"
- Response output
  - depends on outputFormat and outputType: form entries in XML, HTML or string format
  - Example: query=edo sum&language=lat&outputFormat=string
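
The following sketch shows one possible client call for GetForms. It assumes a hypothetical host (BASE_URL) and a UTF-8 encoded response; neither is stated in this document. The fetch function is passed in as an argument so that the call can be tried without network access.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

# Hypothetical host: this document gives only the path of each service.
BASE_URL = "http://localhost:8080"
PATH = "/mpiwg-mpdl-cms-web/lt/GetForms"


def http_get_text(url):
    """Fetch a URL and decode the body as UTF-8 (assumed response encoding)."""
    with urlopen(url) as response:
        return response.read().decode("utf-8")


def get_forms(lemmas, language="eng", fetch=http_get_text):
    """Request the forms of the given lemmas and return them as a list."""
    params = {
        "query": " ".join(lemmas),
        "language": language,
        "outputFormat": "string",  # blank-separated names, easy to split
    }
    url = BASE_URL + PATH + "?" + urlencode(params)
    return fetch(url).split()
```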
Url: /mpiwg-mpdl-cms-web/text/Tokenize
- Request parameters
  - inputString or srcUrl (required)
    - inputString
      - string which should be tokenized, either
        - unstructured text or
        - XML fragment/document
    - srcUrl
      - source URL of the input, either
        - unstructured text or
        - XML fragment/document
  - language (optional)
    - ISO 639-3 specifier
    - if the input is XML and an element carries the attribute "xml:lang", that value is used for this element
    - default: "eng"
  - normalization (optional)
    - "none" (no normalization)
    - "reg" (regularized)
    - "norm" (regularized + normalized)
    - default: "norm"
  - normalizationType (optional)
    - "dictionary"
    - "display"
    - default: "dictionary"
  - elements (optional)
    - list of XML element names which should be tokenized (e.g. "s head")
    - default: empty list (which means: all elements are tokenized)
  - stopElements (optional)
    - list of XML element names which are treated as stop elements (e.g. "var emph"): the tokens of a stop element get no word tags (if output format is "xml") or are removed (if output format is "string")
    - default: empty list
  - highlightTerms (optional)
    - list of word forms which should be highlighted. Each matched word form is surrounded by <hi></hi>. Matching depends on the normalization: e.g. if normalization = "norm", the normalized word form is matched and highlighted.
    - default: empty list
  - outputFormat (optional)
    - "xml"
    - "string"
    - default: "xml"
  - outputOptions (optional)
    - output options separated by blanks (e.g. "withForms withLemmas")
      - "withForms"
      - "withLemmas"
      - default: empty list
- Response output
  - outputFormat=xml
    - tokenized inputString or document (enriched by element <w>)
      - Example: <s><w lang="deu" form="dies" formRegularized="dies" formNormalized="dies" forms="dies, dieser, dieses, diesen" lemmas="dieser">Dies</w> <w lang="deu" form="ist" formRegularized="ist" formNormalized="ist" forms="bin, bist, ist, seid, sind, sein, war, warst, wart" lemmas="sein">ist</w> <w lang="deu" form="ein" formRegularized="ein" formNormalized="ein" forms="ein, eines, einer" lemmas="ein">ein</w> <w lang="deu" form="satz" formRegularized="satz" formNormalized="satz" forms="satz, sätze, satzes" lemmas="satz">Satz</w></s>
  - outputFormat=string
    - word tokens of inputString or document (separated by blanks)
  - Example: inputString=edo sum philoſophi&language=lat&outputFormat=xml
  - Example: language=lat&srcUrl=http://mpdl-system.mpiwg-berlin.mpg.de/mpdl/page-query-result.xql?document=/echo/la/Benedetti_1585.xml%26mode=pureXml%26pn=13
  - Example: language=lat&highlightTerms=eorumque&srcUrl=http://mpdl-system.mpiwg-berlin.mpg.de/mpdl/page-query-result.xql?document=/echo/la/Benedetti_1585.xml%26mode=pureXml%26pn=13
  - Example: language=lat&outputOptions=withForms withLemmas&srcUrl=http://mpdl-system.mpiwg-berlin.mpg.de/mpdl/page-query-result.xql?document=/echo/la/Benedetti_1585.xml%26mode=pureXml%26pn=13
  - Example: language=lat&outputFormat=string&normalization=orig&srcUrl=http://mpdl-system.mpiwg-berlin.mpg.de/mpdl/page-query-result.xql?document=/echo/la/Benedetti_1585.xml%26mode=pureXml%26pn=13
  - Example: language=lat&outputFormat=string&outputOptions=withLemmas&srcUrl=http://mpdl-system.mpiwg-berlin.mpg.de/mpdl/page-query-result.xql?document=/echo/la/Benedetti_1585.xml%26mode=pureXml%26pn=13
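
A minimal sketch of a Tokenize round trip: building a request URL for a srcUrl, and reading the <w> elements of an XML response. BASE_URL and the function names are hypothetical. The sketch also assumes that the <w> elements carry no XML namespace and that multiple values in "forms" and "lemmas" are separated by commas, as in the response example above.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

# Hypothetical host: this document gives only the path of each service.
BASE_URL = "http://localhost:8080"
PATH = "/mpiwg-mpdl-cms-web/text/Tokenize"


def tokenize_url(src_url, language="eng", output_options=(), **params):
    """Build a Tokenize request URL for a source document."""
    pairs = [("srcUrl", src_url), ("language", language)]
    if output_options:
        # Output options are sent as one blank-separated value.
        pairs.append(("outputOptions", " ".join(output_options)))
    pairs += sorted(params.items())
    # urlencode escapes "&" and "=" inside srcUrl (as %26 and %3D), so the
    # source URL's own query string is not mistaken for service parameters.
    return BASE_URL + PATH + "?" + urlencode(pairs)


def _split(value):
    """Split a comma-separated attribute value (assumed separator)."""
    if not value:
        return []
    return [item.strip() for item in value.split(",") if item.strip()]


def read_tokens(xml_text):
    """Collect text, language, normalized form, forms and lemmas per <w>."""
    root = ET.fromstring(xml_text)
    return [
        {
            "text": w.text,
            "lang": w.get("lang"),
            "normalized": w.get("formNormalized"),
            "forms": _split(w.get("forms")),    # empty without "withForms"
            "lemmas": _split(w.get("lemmas")),  # empty without "withLemmas"
        }
        for w in root.iter("w")
    ]
```

The examples above escape only the "&" of the source URL as %26; urlencode escapes further reserved characters as well, which decodes to the same value.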
Url: /mpiwg-mpdl-cms-web/text/Normalize
- Request parameters
  - inputString (required)
    - string which should be normalized
  - language (optional)
    - ISO 639-3 specifier
    - default: "eng"
  - type (optional)
    - "dictionary"
    - "display"
    - default: "display"
- Response output
  - normalized string
  - Example: inputString=philoſophi&language=lat
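
Input strings such as "philoſophi" contain non-ASCII characters, which have to be percent-encoded in a request URL. The sketch below assumes that the service expects UTF-8 and uses a hypothetical host (BASE_URL); the function name normalize_url is likewise made up for illustration.

```python
from urllib.parse import urlencode

# Hypothetical host: this document gives only the path of each service.
BASE_URL = "http://localhost:8080"
PATH = "/mpiwg-mpdl-cms-web/text/Normalize"


def normalize_url(input_string, language="eng", norm_type="display"):
    """Build a Normalize request URL with a UTF-8 percent-encoded input."""
    params = {
        "inputString": input_string,
        "language": language,
        "type": norm_type,  # "dictionary" or "display"
    }
    # The long s (U+017F) becomes the two UTF-8 bytes %C5%BF.
    return BASE_URL + PATH + "?" + urlencode(params, encoding="utf-8")
```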
Url: /mpiwg-mpdl-cms-web/text/Transcode
- Request parameters
  - inputString (required)
    - string which should be transcoded
  - srcEncoding (required)
    - "betacode"
    - "buckwalter"
    - "unicode"
  - destEncoding (optional)
    - "betacode"
    - "buckwalter"
    - "unicode"
    - default: "unicode"
- Response output
  - transcoded string
  - Example: inputString=kai/&srcEncoding=betacode&destEncoding=unicode
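
Because srcEncoding and destEncoding accept only the three values listed above, a client can check them before sending a request. A minimal sketch, with a hypothetical host (BASE_URL) and a made-up function name:

```python
from urllib.parse import urlencode

# Hypothetical host: this document gives only the path of each service.
BASE_URL = "http://localhost:8080"
PATH = "/mpiwg-mpdl-cms-web/text/Transcode"

# Encoding names listed for srcEncoding and destEncoding.
ENCODINGS = ("betacode", "buckwalter", "unicode")


def transcode_url(input_string, src_encoding, dest_encoding="unicode"):
    """Build a Transcode request URL after checking the encoding names."""
    for name in (src_encoding, dest_encoding):
        if name not in ENCODINGS:
            raise ValueError(f"unsupported encoding: {name!r}")
    params = {
        "inputString": input_string,
        "srcEncoding": src_encoding,
        "destEncoding": dest_encoding,
    }
    # The "/" of Beta Code input such as "kai/" is escaped as %2F.
    return BASE_URL + PATH + "?" + urlencode(params)
```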