Neither vector nor raster information describes the actual embedded _text_
we're talking about, though: the stuff that lets you copy-and-paste _text_
(not images), or search text. PDFs can have that too, and can even record
which portions of a displayed raster image correspond to which characters.
Text characters in a PDF aren't vector images; they're actually character
codes stored in some encoding, which the font maps to glyphs for display.

On 4/28/2011 5:00 PM, Carl Wiedemann wrote:
> I should also remark that vector information and raster information may
> exist in the same PDF file. For example, a PDF of a magazine or newspaper
> will probably have vector text and column borders, while photography will
> be raster at ~300dpi.
>
>
> On Thu, Apr 28, 2011 at 2:58 PM, Carl Wiedemann<[log in to unmask]>wrote:
>
>> Generally PDFs are capable of displaying two types of information: Vector
>> and Raster.
>>
>> Vector information is composed of data that describes points, smooth
>> lines, gradients, and curves. Vector information is lossless and has
>> no native resolution; it can be infinitely scaled. Text data is understood
>> as vector information if we were to regard textual documents as images.
>> Generally, composing a document in a word processor and printing it to
>> a PDF results in the text as actual vector shapes -- you can zoom in on the
>> text as much as you'd like. PDF readers understand this information as
>> native text: you can select the text with a cursor, search the text, and
>> copy/paste. Other formats like SVG and EPS generally express vector
>> information.
>>
>> Raster information is composed of pixels. JPEG, PNG, GIF, BMP, TIFF are
>> examples of raster information. These have a definite resolution, and, from
>> a computing perspective, are just a bunch of dots. When you scan an image
>> (or a document), it is digitally translated into a raster. Digital
>> photographs are raster.
>> There are some techniques using Optical Character Recognition
>> (OCR) which can actually recognize characters in a raster image and
>> transform them into text data. There are also procedures to do a "bitmap
>> trace" to attempt to create vector information from a raster image.
>>
>> More info here
>> http://en.wikipedia.org/wiki/Vector_graphics
>> http://en.wikipedia.org/wiki/Raster_graphics
>>
>>
>> On Thu, Apr 28, 2011 at 11:10 AM, Van Mil, James (vanmiljf)<
>> [log in to unmask]> wrote:
>>
>>> I often employ the word 'raster', along with some other foul language, for
>>> any PDFs that don't have manipulate-able text.
>>>
>>> -James
>>>
>>> -----Original Message-----
>>> From: Code for Libraries [mailto:[log in to unmask]] On Behalf Of
>>> Keith Jenkins
>>> Sent: Thursday, April 28, 2011 1:06 PM
>>> To: [log in to unmask]
>>> Subject: Re: [CODE4LIB] What's the descriptive technical terminology?...
>>> pdf image of a page. pdf format used with cut paste.
>>>
>>> I've also heard many people use the term "searchable PDF" for a text-based
>>> PDF.
>>>
>>> Keith
>>>
>>>
>>> On Thu, Apr 28, 2011 at 12:43 PM, Peter Murray<[log in to unmask]>
>>> wrote:
>>>> That is the same terminology I use as well -- image-based versus
>>>> text-based. I find that works most times because people can visually see if
>>>> something looks like a scanned image.
>>>
>>