Thursday, September 13, 2012

Reading Notes - Week of September 17, 2012

"Data Compression" from Wikipedia

  • Data compression involves encoding information using fewer bits than the original representation
  • Two types of compression
    • Lossless
      • Reduces bits by identifying and eliminating statistical redundancy
    • Lossy
      • Reduces bits by identifying marginally important information and removing it
  • Formally known as source coding
  • Helps reduce resource usage (e.g., data storage space, transmission capacity)
  • Theoretical background
    • Lossless – Information Theory
    • Lossy – rate-distortion theory
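
To make the information-theory point concrete for myself, here is a minimal Python sketch. It is my own illustration, not something from the article, and the function names are made up for these notes. It estimates each symbol's probability from its frequency in a string, then computes the ideal code length for each symbol (the negative base-2 logarithm of its probability) and the Shannon entropy, which is the average number of bits per symbol that a lossless coder can at best approach for this symbol-by-symbol model.

```python
import math
from collections import Counter


def ideal_code_lengths(text):
    """Map each symbol to its ideal code length in bits, -log2(p)."""
    counts = Counter(text)
    total = len(text)
    return {sym: -math.log2(n / total) for sym, n in counts.items()}


def entropy_bits_per_symbol(text):
    """Shannon entropy of the symbol frequencies, in bits per symbol."""
    counts = Counter(text)
    total = len(text)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


if __name__ == "__main__":
    sample = "aaaaaaab"  # highly redundant: one symbol dominates
    print(ideal_code_lengths(sample))
    print(entropy_bits_per_symbol(sample))
```

A string with one dominant symbol has low entropy, which is the "statistical redundancy" that lossless compression exploits. A string where every symbol is equally likely has the highest entropy for its alphabet and leaves nothing for this kind of coder to remove.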

"Data Compression Basics"

  • Part 1: Lossless Data Compression
    • Fundamental idea behind data compression is to take a given representation of information and replace it with a different representation that takes up less space, from which the original data can later be recovered
    • If the recovered information is guaranteed to be exactly identical to the original, then the compression method is described as “lossless”
    • A simple lossless compression algorithm is “run-length encoding” (RLE)
      • Replaces long runs of characters with a single character and the length of the run
    • Lempel-Ziv compressor family
    • Entropy coding
      • Assigns codes to blocks of data so that more probable blocks get shorter codes; ideally, the code length is proportional to the negative logarithm of the block's probability
    • Prediction and error coding
  • Part 2: Lossy Compression of Stills and Audio
    • Important to distinguish data from information
    • Fundamental idea behind lossy compression is preserving meaning rather than preserving data
    • By allowing for some deviation from the source data when encoding patterns, lossy compression greatly reduces the amount of data required to describe the “meaning” of the source media
    • Lossy compression is ideally applied to information that is meant to be interpreted by a reasonably sophisticated “meaning processor” (human, image recognition software, etc.) that looks at a representation or rendering of the data rather than the data itself
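
The contrast between the two parts can be sketched in a few lines of Python. This is my own illustration under simple assumptions, not code from the article: run-length encoding stands in for the lossless side, and uniform quantization (snapping values to multiples of a step size) is a lossy step I picked for illustration. The names `rle_encode`, `rle_decode`, and `quantize` are hypothetical.

```python
from itertools import groupby


def rle_encode(text):
    """Replace each run of identical characters with a (char, run_length) pair."""
    return [(ch, len(list(run))) for ch, run in groupby(text)]


def rle_decode(pairs):
    """Expand (char, run_length) pairs back into the original string."""
    return "".join(ch * count for ch, count in pairs)


def quantize(samples, step):
    """Lossy step: snap each sample to a nearby multiple of step.

    Uses Python's round(), so exact ties go to the even multiple.
    """
    return [round(s / step) * step for s in samples]


original = "AAAABBBCCD"
encoded = rle_encode(original)
# Lossless: decoding recovers exactly what went in.
assert rle_decode(encoded) == original

samples = [1, 3, 13, 14, 250]
coarse = quantize(samples, 8)
# Lossy: coarse has fewer distinct values (so it compresses better),
# but the original samples cannot be recovered from it.
```

Run-length encoding only pays off when the input really has long runs. On text with few repeats, the list of pairs can be larger than the input.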

Edward A. Galloway. “Imaging Pittsburgh: Creating a Shared Gateway to Digital Image Collections of the Pittsburgh Region.”

  • The main focus of the project was to create a single Web gateway for the public to access thousands of visual images held in the collections of the Pitt Archives Service Center, CMOA, and the Historical Society of Western PA
  • The content partners were responsible for selecting collections/images, describing/cataloging images, digitizing them, and delivering images/metadata to the Digital Research Library (DRL)
  • DRL was responsible for providing access to the image collections via DLXS middleware
  • Characteristics of the Web gateway
    • Conduct keyword searches across all image collections
    • Browse images
    • Read about the collections and their contents
    • Explore images by time/place/theme
    • Order image reproductions
  • Communication challenges
  • Selection challenges
  • Metadata challenges
  • Project-wide vs local needs
  • Workflow challenges
  • Website development challenges

I was not able to access "YouTube and Libraries: It Could be a Beautiful Relationship" by Paula L. Webb.




