eKorpus

Data model

Here are described the main entities of eKorpus data model. Clickable map:

Data model

Users

User

For each system user (administrator, speaker or testee) a record is created. The database contains only anonymous information.

CREATE TABLE "user" (
  id serial PRIMARY KEY,
  name character varying(40) NOT NULL,
  roles character varying(256),
  "password" character varying(40),
  gender character varying(1),
  age integer,
  education character varying(8),
  nationality character varying(8),
  "language" character varying(2),
  language2 character varying(2),
  delay timestamp without time zone,
  blocked integer,
  ignore integer,
  empathy integer,
  experience integer
);

The fields are:

id:

autogenerated unique key

name:

user name. It is used to login to eKorpus system.

roles:

only role used is admin. Users with admin role have full access to eKorpus.

password:

secret password. It is used to login to eKorpus system.

gender:

0 - male or 1 - female

age:

in years. use birth date

education:

Following values are used:

alg:basic
pohi:primary
kesk:secondary
keri:professional
korg:higher
tead:acad. degree
- -:other
nationality:

Following values are used: et, lv, ru, fi, –, ...

language:

mother tongue. et, lv, ru, se, fi, – ...

language2:

education language

delay:

@TODO unused?

blocked:

user is blocked from logging into system

ignore:

ignore user test results in reports

empathy:

empathy test score

experience:

general work experience in years

Todo

user.age replace with birth date.

Todo

user.delay hm, remove unused.

Groups

User groups are defined in table groups.

CREATE TABLE groups (
  id serial PRIMARY KEY,
  text character varying(256),
  code character varying(8)
);

The fields are:

id:autogenerated unique key
text:displayname.
code:code, identifies group in html, etc

User groups

Users are assigned to groups in table usergroups.

CREATE TABLE usergroups (
  user_id integer NOT NULL REFERENCES "user"(id) ON DELETE CASCADE,
  groups_id integer NOT NULL REFERENCES groups(id) ON DELETE CASCADE,
);

Recordings

The recordings are stored as WAV files. Database contains only metainformation:

  • recording information
  • segmentation layers
  • segments of each segmentation layer

Recording

CREATE TABLE recording (
  id serial PRIMARY KEY,
  "comment" character varying(256),
  user_id integer REFERENCES "user"(id),
  rec_date timestamp without time zone,
  file character varying(256)
);

The fields are:

id:autogenerated unique key
comment:free text helping to identify the recording
user_id:reference to speaker record in User table
rec_date:recording date
file:file name. This is a relative path.

Segmentation

Each recording can have multiple segmentation layers.

CREATE TABLE segmentation (
  id serial PRIMARY KEY,
  recording_id integer NOT NULL REFERENCES recording(id),
  "type" character varying(32),
  "comment" character varying(256)
);

The fields are:

id:autogenerated unique key
recording_id:reference to the Recording table
type:segmentation type: sentence, word, phoneme, ..
comment:free text helping to identify the segmentation

Segment

Each segmentation consists of multiple segments.

CREATE TABLE segment (
  id serial PRIMARY KEY,
  segmentation_id integer NOT NULL REFERENCES segmentation(id) ON DELETE CASCADE,
  label character varying(4000),
  "start" integer NOT NULL,
  "end" integer NOT NULL,
  jnr integer DEFAULT 0
);

The fields are:

id:autogenerated unique key
segmentation_id:
 reference to the Segmentation table
label:segmentation type specific markup
start:segment start time relative to recording start
end:segment end time relative to recording start
jnr:position number in parent segmentation unit.

Test definitions

Tests are defined as a sequence of multiple choice items (questions). There are two types of tests:

  • perception tests
  • generic multiple choice tests

All perception test items have same item question and options list. They also have an associated sound segment.

Test

CREATE TABLE test (
  id serial PRIMARY KEY,
  selection_id integer REFERENCES selection(id),
  text character varying(256),
  "comment" character varying(256),
  instruction character varying(4000),
  lang character varying(2),
  "type" character varying(8)
);

The fields are:

id:autogenerated unique key
selection_id:reference to the selection table, the default options list
text:item question for perception tests
comment:free text helping to identify the test
instruction:test instruction
lang:test language
type:perception or generic

Item

Each test consists of multiple items.

CREATE TABLE item (
  id serial PRIMARY KEY,
  test_id integer NOT NULL REFERENCES test(id) ON UPDATE CASCADE ON DELETE CASCADE,
  selection_id integer REFERENCES selection(id),
  nr integer,
  text character varying(256),
  first_id integer REFERENCES segment(id),
  last_id integer REFERENCES segment(id),
  voiced_item_id integer REFERENCES item(id)
);

The fields are:

id:autogenerated unique key
test_id:reference to the test table (parent)
selection_id:reference to the selection table (children)
nr:item sequence nr
text:item question text
first_id:start segment id for perception tests
last_id:end segment id for perception tests
voiced_item_id:reading tests have here listening test item id.

Selection

Each item has an options list which is identified by a record in selection table.

CREATE TABLE selection (
  id serial PRIMARY KEY,
  "comment" character varying(256),
  code character varying(8),
  lang character varying(2)
);

The fields are:

id:autogenerated unique key
comment:free text helping to identify the options list
code:unique id for data exports
lang:test language

Choice

Options belonging to an options list are defined in the choice table.

CREATE TABLE choice (
  id serial PRIMARY KEY,
  selection_id integer NOT NULL REFERENCES selection(id) ON UPDATE CASCADE ON DELETE CASCADE,
  nr integer,
  text character varying(256),
  code character varying(8),
  english character varying(256)
);

The fields are:

id:autogenerated unique key
selection_id:reference to the selection table, parent
nr:choice sequence nr
text:choice text
code:unique id for data exports
english:translation?

Testuser

CREATE TABLE testuser (
  test_id integer REFERENCES test(id) ON DELETE CASCADE,
  user_id integer REFERENCES "user"(id) ON DELETE CASCADE
);

Responsekey

Generic tests can have a response key for calculating test scores.

CREATE TABLE responsekey (
  item_id integer REFERENCES item(id) ON UPDATE CASCADE ON DELETE CASCADE,
  nr integer,
  code character varying(8),
  value integer
);

The fields are:

id:autogenerated unique key
item_id:reference to the item table
nr:choice sequence nr
code:unique id for data exports
value:score value

Test sessions

Test are executed in user sessions.

Session

CREATE TABLE "session" (
  id serial PRIMARY KEY,
  user_id integer NOT NULL REFERENCES "user"(id) ON UPDATE CASCADE ON DELETE CASCADE,
  test_id integer NOT NULL REFERENCES test(id) ON UPDATE CASCADE ON DELETE CASCADE,
  score character varying(256),
  ts timestamp without time zone
);

The fields are:

id:autogenerated unique key
user_id:reference to the user table
test_id:reference to the test table
score:score value
ts:test session start time

Response

For each test item a user response is stored.

CREATE TABLE response (
  id serial PRIMARY KEY,
  session_id integer NOT NULL REFERENCES "session"(id) ON UPDATE CASCADE ON DELETE CASCADE;,
  item_id integer NOT NULL REFERENCES item(id) ON UPDATE CASCADE ON DELETE CASCADE,
  code character varying(8),
  ts timestamp without time zone
);

The fields are:

id:autogenerated unique key
session_id:reference to the session table, parent
item_id:reference to the item table
code:choice code
ts:timestamp

Supporting entities

In order to optimize the reports performance for each segment participating in perception tests the response scores are calculated and stored.

CREATE TABLE segmentscore (
  segment_id integer NOT NULL REFERENCES segment(id) ON DELETE CASCADE,
  code character varying(8) NOT NULL,
  value integer,
  text_code character varying(8) NOT NULL,
  text_value integer
);

The fields are:

segment_id:reference to the segment table, parent
code:choice code
value:
text_code:
text_value:
CREATE TABLE profile (
  id serial PRIMARY KEY,
  name character varying(40) NOT NULL,
  gender character varying(1),
  age integer,
  education real,
  arevus real,
  viha real,
  masendus real,
  ujedus real,
  liialdamine real,
  abitus real,
  sobralikkus real,
  seltsivus real,
  kehtestavus real,
  aktiivsus real,
  elamustejanu real,
  roomsameelsus real,
  kujutlusvoime real,
  kunstilisedhuvid real,
  tundelisus real,
  vahelduseiha real,
  intellektuaalsus real,
  avatusvaartustele real,
  usaldus real,
  kolbelisus real,
  omakasupyydmatus real,
  leplikkus real,
  tagasihoidlikkus real,
  kaastundlikkus real,
  enesetohusus real,
  korralikkus real,
  kohusetundlikkus real,
  saavutusvajadus real,
  enesedistsipliin real,
  ettevaatlikkus real,
  neuroticism real,
  extraversion real,
  openness real,
  agreeableness real,
  conscientiousness real,
  sotsdes real,
  neuroot real,
  ekstrav real,
  avatus real,
  sotsiaal real,
  meelek real
);