Helsinki's city administration is working through a backlog of thousands of duplicate photographs embedded in its digital planning and cultural heritage databases, a problem that archivists say stems from a decade of overlapping digitisation projects that were never properly reconciled with one another.
The issue matters now because the city is midway through a major open-data push tied to its Helsinki City Strategy 2021–2025, which committed municipal departments to making planning documents, historical records and urban development materials publicly searchable. Duplicate images slow indexing, inflate storage costs and, more practically, confuse residents and researchers trying to trace the history of specific streets or buildings. When the same photograph appears under two different metadata entries — sometimes with conflicting dates or location tags — it undermines the reliability of the entire archive.
A Tangle of Digitisation Projects
The roots of the problem go back to at least 2013, when Helsinki City Museum on Aleksanterinkatu launched its first large-scale scan of analogue photographic collections covering the city's 20th-century development. That project ran independently of a parallel initiative at the Helsinki Urban Environment Division on Kyläsaarenkuja, which was digitising planning photographs and site inspection images. The two projects uploaded material to separate content management systems. When the city later attempted to migrate both sets of records into the shared Finna service, deduplication was handled manually, and incompletely.
Staff at Helsinki City Museum have described the situation in internal planning documents as a structural legacy issue rather than a one-off error. The Finna platform, which is administered nationally by the National Library of Finland and serves dozens of Finnish cultural institutions, does have automated duplicate-detection tools, but those tools depend on consistent metadata tagging. Helsinki's two source systems used different field conventions for photographer names, address formats and collection identifiers, which meant automated matching failed on a significant share of records.
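To illustrate why inconsistent conventions defeat automated matching, the sketch below normalises two fields before comparing records. It is a minimal illustration only: the field names `photographer` and `address`, the sample records and the normalisation rules are all assumptions made for this example, and do not reflect Finna's actual schema or matching logic.

```python
import re
import unicodedata


def normalise_name(name: str) -> str:
    """Reduce 'Surname, Given' and 'Given Surname' to one casefolded form."""
    name = unicodedata.normalize("NFC", name).strip()
    if "," in name:
        # maxsplit=1 guarantees exactly two parts when a comma is present
        surname, given = (part.strip() for part in name.split(",", 1))
        name = f"{given} {surname}"
    return re.sub(r"\s+", " ", name).casefold()


def normalise_address(address: str) -> str:
    """Casefold an address, drop commas and full stops, collapse whitespace."""
    address = unicodedata.normalize("NFC", address).casefold()
    address = re.sub(r"[.,]", " ", address)
    return re.sub(r"\s+", " ", address).strip()


def match_key(record: dict[str, str]) -> tuple[str, str]:
    """Build a comparison key from the (hypothetical) record fields."""
    return (
        normalise_name(record["photographer"]),
        normalise_address(record["address"]),
    )


# Hypothetical records describing the same photograph in two source systems.
museum_record = {"photographer": "Virtanen, Aino", "address": "Fleminginkatu 5, Helsinki"}
planning_record = {"photographer": "Aino  Virtanen", "address": "fleminginkatu 5 Helsinki"}

# The raw fields differ, so a naive comparison sees two unrelated records.
raw_match = museum_record == planning_record
# After normalisation the two records produce the same key.
normalised_match = match_key(museum_record) == match_key(planning_record)
```

A key of this kind only flags candidate pairs. Two different photographs by the same photographer at the same address would share a key, so a match would still need to be confirmed against the images themselves.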
The Kallio neighbourhood archive is one concrete example of where the duplication runs deepest. Kallio was extensively photographed during the urban renewal programmes of the 1970s and again during the residential rehabilitation work of the 1990s. Images from both periods were digitised at least twice — once from original negatives and once from printed contact sheets — and entered into the system as separate records without cross-referencing.
Scale of the Problem and the Fix Now Underway
Exact figures have not been published by the city, but Helsinki City Museum's annual report for 2024 noted that its digitised photographic holdings had grown to over 180,000 items accessible through Finna, with an internal audit flagging a meaningful proportion as requiring review for duplication or conflicting metadata. The museum launched a dedicated metadata correction project in January 2025 with a budget line under the city's cultural services allocation.
The museum's approach combines hash-based image matching, which can identify exact copies of a file regardless of filename, with manual review for near-duplicate images where lighting or cropping differs slightly between scans. Staff are working district by district, beginning with Töölö and Punavuori, both of which have dense photographic records tied to planning disputes and heritage protection applications over the years.
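The hash-based step can be sketched roughly as follows. This is an illustrative example under stated assumptions, not the museum's actual tooling: the choice of SHA-256, the function names and the directory layout are all hypothetical. It groups files whose bytes are identical, whatever they are called.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def file_digest(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file's bytes, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def group_exact_duplicates(root: Path) -> list[list[Path]]:
    """Group files under root whose contents are byte-for-byte identical."""
    by_digest: dict[str, list[Path]] = defaultdict(list)
    for path in sorted(root.rglob("*")):
        if path.is_file():
            by_digest[file_digest(path)].append(path)
    # Only digests shared by two or more files indicate duplicates.
    return [paths for paths in by_digest.values() if len(paths) > 1]
```

A digest of this kind matches only files that are identical byte for byte. Two scans of the same photograph, for instance one from the negative and one from a contact sheet, will produce different digests, which is why near-duplicates still go to manual review.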
The Finna platform itself received a metadata standards update in March 2026 that tightened field requirements for new uploads from all contributing institutions, which archivists say will prevent the problem from growing. But it does nothing to resolve what is already in the system.
For residents and researchers, the most immediate practical step is to cross-reference any image found through Finna with Helsinki City Museum's own catalogue at hkm.fi, which is being updated in real time as the correction project proceeds. Anyone using archival photographs for planning applications, academic research or neighbourhood history projects should check both sources and cite the record identifier, not just the filename, since identifiers remain stable even when metadata is corrected. The city expects the Töölö and Punavuori reviews to be completed before the end of 2026.