Data model¶
Here are described the main entities of eKorpus data model. Clickable map:
Users¶
User¶
For each system user (administrator, speaker or testee) a record is created. The database contains only anonymous information.
CREATE TABLE "user" (
id serial PRIMARY KEY,
name character varying(40) NOT NULL,
roles character varying(256),
"password" character varying(40),
gender character varying(1),
age integer,
education character varying(8),
nationality character varying(8),
"language" character varying(2),
language2 character varying(2),
delay timestamp without time zone,
blocked integer,
ignore integer,
empathy integer,
experience integer
);
The fields are:
id: | autogenerated unique key |
||||||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
name: | user name. It is used to login to eKorpus system. |
||||||||||||||
roles: | only role used is admin. Users with admin role have full access to eKorpus. |
||||||||||||||
password: | secret password. It is used to login to eKorpus system. |
||||||||||||||
gender: | 0 - male or 1 - female |
||||||||||||||
age: | in years. use birth date |
||||||||||||||
education: | Following values are used:
|
||||||||||||||
nationality: | Following values are used: et, lv, ru, fi, –, ... |
||||||||||||||
language: | mother tongue. et, lv, ru, se, fi, – ... |
||||||||||||||
language2: | education language |
||||||||||||||
delay: | @TODO unused? |
||||||||||||||
blocked: | user is blocked from logging into system |
||||||||||||||
ignore: | ignore user test results in reports |
||||||||||||||
empathy: | empathy test score |
||||||||||||||
experience: | general work experience in years |
Todo
user.age replace with birth date.
Todo
user.delay hm, remove unused.
Groups¶
User groups are defined in table groups.
CREATE TABLE groups (
id serial PRIMARY KEY,
text character varying(256),
code character varying(8)
);
The fields are:
id: | autogenerated unique key |
---|---|
text: | displayname. |
code: | code, identifies group in html, etc |
User groups¶
Users are assigned to groups in table usergroups.
CREATE TABLE usergroups (
user_id integer NOT NULL REFERENCES "user"(id) ON DELETE CASCADE,
groups_id integer NOT NULL REFERENCES groups(id) ON DELETE CASCADE,
);
Recordings¶
The recordings are stored as WAV files. Database contains only metainformation:
- recording information
- segmentation layers
- segments of each segmentation layer
Recording¶
CREATE TABLE recording (
id serial PRIMARY KEY,
"comment" character varying(256),
user_id integer REFERENCES "user"(id),
rec_date timestamp without time zone,
file character varying(256)
);
The fields are:
id: | autogenerated unique key |
---|---|
comment: | free text helping to identify the recording |
user_id: | reference to speaker record in User table |
rec_date: | recording date |
file: | file name. This is a relative path. |
Segmentation¶
Each recording can have multiple segmentation layers.
CREATE TABLE segmentation (
id serial PRIMARY KEY,
recording_id integer NOT NULL REFERENCES recording(id),
"type" character varying(32),
"comment" character varying(256)
);
The fields are:
id: | autogenerated unique key |
---|---|
recording_id: | reference to the Recording table |
type: | segmentation type: sentence, word, phoneme, .. |
comment: | free text helping to identify the segmentation |
Segment¶
Each segmentation consists of multiple segments.
CREATE TABLE segment (
id serial PRIMARY KEY,
segmentation_id integer NOT NULL REFERENCES segmentation(id) ON DELETE CASCADE,
label character varying(4000),
"start" integer NOT NULL,
"end" integer NOT NULL,
jnr integer DEFAULT 0
);
The fields are:
id: | autogenerated unique key |
---|---|
segmentation_id: | |
reference to the Segmentation table | |
label: | segmentation type specific markup |
start: | segment start time relative to recording start |
end: | segment end time relative to recording start |
jnr: | position number in parent segmentation unit. |
Test definitions¶
Tests are defined as a sequence of multiple choice items (questions). There are two types of tests:
- perception tests
- generic multiple choice tests
All perception test items have same item question and options list. They also have an associated sound segment.
Test¶
CREATE TABLE test (
id serial PRIMARY KEY,
selection_id integer REFERENCES selection(id),
text character varying(256),
"comment" character varying(256),
instruction character varying(4000),
lang character varying(2),
"type" character varying(8)
);
The fields are:
id: | autogenerated unique key |
---|---|
selection_id: | reference to the selection table, the default options list |
text: | item question for perception tests |
comment: | free text helping to identify the test |
instruction: | test instruction |
lang: | test language |
type: | perception or generic |
Item¶
Each test consists of multiple items.
CREATE TABLE item (
id serial PRIMARY KEY,
test_id integer NOT NULL REFERENCES test(id) ON UPDATE CASCADE ON DELETE CASCADE,
selection_id integer REFERENCES selection(id),
nr integer,
text character varying(256),
first_id integer REFERENCES segment(id),
last_id integer REFERENCES segment(id),
voiced_item_id integer REFERENCES item(id)
);
The fields are:
id: | autogenerated unique key |
---|---|
test_id: | reference to the test table (parent) |
selection_id: | reference to the selection table (children) |
nr: | item sequence nr |
text: | item question text |
first_id: | start segment id for perception tests |
last_id: | end segment id for perception tests |
voiced_item_id: | reading tests have here listening test item id. |
Selection¶
Each item has an options list which is identified by a record in selection table.
CREATE TABLE selection (
id serial PRIMARY KEY,
"comment" character varying(256),
code character varying(8),
lang character varying(2)
);
The fields are:
id: | autogenerated unique key |
---|---|
comment: | free text helping to identify the options list |
code: | unique id for data exports |
lang: | test language |
Choice¶
Options belonging to an options list are defined in the choice table.
CREATE TABLE choice (
id serial PRIMARY KEY,
selection_id integer NOT NULL REFERENCES selection(id) ON UPDATE CASCADE ON DELETE CASCADE,
nr integer,
text character varying(256),
code character varying(8),
english character varying(256)
);
The fields are:
id: | autogenerated unique key |
---|---|
selection_id: | reference to the selection table, parent |
nr: | choice sequence nr |
text: | choice text |
code: | unique id for data exports |
english: | translation? |
Testuser¶
CREATE TABLE testuser (
test_id integer REFERENCES test(id) ON DELETE CASCADE,
user_id integer REFERENCES "user"(id) ON DELETE CASCADE
);
Responsekey¶
Generic tests can have a response key for calculating test scores.
CREATE TABLE responsekey (
item_id integer REFERENCES item(id) ON UPDATE CASCADE ON DELETE CASCADE,
nr integer,
code character varying(8),
value integer
);
The fields are:
id: | autogenerated unique key |
---|---|
item_id: | reference to the item table |
nr: | choice sequence nr |
code: | unique id for data exports |
value: | score value |
Test sessions¶
Test are executed in user sessions.
Session¶
CREATE TABLE "session" (
id serial PRIMARY KEY,
user_id integer NOT NULL REFERENCES "user"(id) ON UPDATE CASCADE ON DELETE CASCADE,
test_id integer NOT NULL REFERENCES test(id) ON UPDATE CASCADE ON DELETE CASCADE,
score character varying(256),
ts timestamp without time zone
);
The fields are:
id: | autogenerated unique key |
---|---|
user_id: | reference to the user table |
test_id: | reference to the test table |
score: | score value |
ts: | test session start time |
Response¶
For each test item a user response is stored.
CREATE TABLE response (
id serial PRIMARY KEY,
session_id integer NOT NULL REFERENCES "session"(id) ON UPDATE CASCADE ON DELETE CASCADE;,
item_id integer NOT NULL REFERENCES item(id) ON UPDATE CASCADE ON DELETE CASCADE,
code character varying(8),
ts timestamp without time zone
);
The fields are:
id: | autogenerated unique key |
---|---|
session_id: | reference to the session table, parent |
item_id: | reference to the item table |
code: | choice code |
ts: | timestamp |
Supporting entities¶
In order to optimize the reports performance for each segment participating in perception tests the response scores are calculated and stored.
CREATE TABLE segmentscore (
segment_id integer NOT NULL REFERENCES segment(id) ON DELETE CASCADE,
code character varying(8) NOT NULL,
value integer,
text_code character varying(8) NOT NULL,
text_value integer
);
The fields are:
segment_id: | reference to the segment table, parent |
---|---|
code: | choice code |
value: | |
text_code: | |
text_value: |
CREATE TABLE profile (
id serial PRIMARY KEY,
name character varying(40) NOT NULL,
gender character varying(1),
age integer,
education real,
arevus real,
viha real,
masendus real,
ujedus real,
liialdamine real,
abitus real,
sobralikkus real,
seltsivus real,
kehtestavus real,
aktiivsus real,
elamustejanu real,
roomsameelsus real,
kujutlusvoime real,
kunstilisedhuvid real,
tundelisus real,
vahelduseiha real,
intellektuaalsus real,
avatusvaartustele real,
usaldus real,
kolbelisus real,
omakasupyydmatus real,
leplikkus real,
tagasihoidlikkus real,
kaastundlikkus real,
enesetohusus real,
korralikkus real,
kohusetundlikkus real,
saavutusvajadus real,
enesedistsipliin real,
ettevaatlikkus real,
neuroticism real,
extraversion real,
openness real,
agreeableness real,
conscientiousness real,
sotsdes real,
neuroot real,
ekstrav real,
avatus real,
sotsiaal real,
meelek real
);