Automated Culling of Data in a Relational Database for Archiving
Blekinge Institute of Technology, Faculty of Computing, Department of Software Engineering.
2019 (English). Independent thesis, Advanced level (professional degree), 20 credits / 30 HE credits. Student thesis.
Abstract [en]

Background. Archiving legacy information systems is challenging. When no option exists for extracting the information in a structured way, the last resort is to save the database itself. Ideally, only the relevant information should be saved, and the rest can be removed.

Objectives. The goal is to develop a method that assists the archivist in culling a database before archiving. The method should be described as rules defining how the tables can be identified.

Methods. To get an overview of how the process works today and what archivists think can be improved, a number of interviews with experts in database archiving are conducted. The interview results are then analysed together with test databases to define rules that can be used in the general case. The rules are then implemented in a prototype, which is tested and evaluated to verify whether the method works.

Results. The results indicate that the algorithm is both faster and able to exclude more irrelevant tables than a person using the manual method. An algorithm for finding candidate keys has also been improved to reduce the number of tests and the execution time in the worst case.
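
To illustrate what finding candidate keys involves, here is a minimal Python sketch of a naive level-wise search. It is not the improved algorithm from the thesis, which this record does not describe; the function name `candidate_keys` and the sample table are hypothetical. The sketch tests column combinations in order of increasing size and skips any combination that contains a key already found, since such a combination cannot be minimal.

```python
from itertools import combinations


def candidate_keys(columns, rows):
    """Return the minimal column sets whose values are unique across all rows.

    Hypothetical helper for illustration only; a naive baseline, not the
    thesis's improved algorithm.
    """
    keys = []  # each key is a tuple of column indexes
    for size in range(1, len(columns) + 1):
        for combo in combinations(range(len(columns)), size):
            # A superset of a known key is unique but not minimal: skip the test.
            if any(set(key) <= set(combo) for key in keys):
                continue
            projected = [tuple(row[i] for i in combo) for row in rows]
            # The combination is a key if no projected value repeats.
            if len(set(projected)) == len(projected):
                keys.append(combo)
    return [tuple(columns[i] for i in key) for key in keys]


# Hypothetical sample data: "id" and "email" are each unique, "name" is not.
people = [(1, "a@x", "Ann"), (2, "b@x", "Ann"), (3, "c@x", "Bob")]
people_keys = candidate_keys(["id", "email", "name"], people)

# Hypothetical sample data: only the pair ("a", "b") is unique.
pairs = [(1, 1), (1, 2), (2, 1)]
pair_keys = candidate_keys(["a", "b"], pairs)
```

Each uniqueness test scans every row, and the number of column combinations grows exponentially with the number of columns, which is why reducing the number of tests matters in the worst case. Note that uniqueness in the stored data only suggests a key; it does not prove that the constraint was intended by the schema designer.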

Conclusions. The evaluation indicates that the method works as intended while reducing the archivist's workload. Further work is needed to improve the method.

Place, publisher, year, edition, pages
2019. p. 46
Keywords [en]
digital preservation, Information Systems, legacy systems, database reverse engineering
National Category
Computer Systems; Software Engineering
Identifiers
URN: urn:nbn:se:bth-18261
OAI: oai:DiVA.org:bth-18261
DiVA, id: diva2:1332882
External cooperation
Sydarkivera
Subject / course
Degree Project in Master of Science in Engineering 30,0 hp
Educational program
PAACI Master of Science in Game and Software Engineering
Presentation
2019-06-05, 11:00 (English)
Supervisors
Examiners
Available from: 2019-08-19. Created: 2019-06-28. Last updated: 2022-05-12. Bibliographically approved.

Open Access in DiVA

fulltext (795 kB), 715 downloads
File information
File name: FULLTEXT01.pdf
File size: 795 kB
Checksum (SHA-512): f069b5db1c7d5124f5f203c6f1662d017ca883f873cff3ea0bcd57a1f8bade7fed238bf30b85925426ae05a0ae92bd43d6f0604767e45ef31d78269bd82c61d3
Type: fulltext
Mimetype: application/pdf

Total: 715 downloads
The number of downloads is the sum of all downloads of full texts. It may include, e.g., previous versions that are no longer available.

Total: 696 hits