|
|
dtSearch
Instantly search terabytes of text across a desktop, network, Internet or Intranet site, web sites, or CD/DVD, all without paying for an annual subscription.
|
 |
|
|
|
|

|
Over two dozen indexed, unindexed, fielded and full-text search options
|
|

|
Highlights hits in HTML, XML and PDF, while displaying embedded links, formatting and images
|
|

|
Converts other file types (word processor, database, spreadsheet, email and full text of email attachments, ZIP, Unicode, etc.) to HTML for display with highlighted hits
|
|

|
Built-in Spider adds a third-party or other Web site (public, secure content, password accessible, etc.) to your searchable database
|
|

|
Spider supports Web-based content (HTML, PDF, XML, etc.) as well as dynamically generated content (ASP.NET, MS CMS, SharePoint, etc.)
|
|
Searching Multiple Data Sources
|
|

|
Federated searching or distributed searching provides integrated searching of multiple data repositories across a network.
|
|

|
Spider lets you index local and remote web-based data, including both static and dynamic data, and public-access and secure sites.
|
|
|
|
Basic Search Types
|
|

|
Natural language searching lets you enter a "plain English" (or any other international language) unstructured search request.
|
|

|
Phrase searching finds phrases like: due process of law.
|
|

|
Boolean operators like and/or/not can join words and phrases: due process of law and not (equal protection or civil rights).
|
|
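As a rough illustration of how and/or/not combine results, the sketch below evaluates the request above against a tiny in-memory collection using Python sets. The sample texts, the `docs` dictionary and the `containing` helper are hypothetical; this is only a minimal model of Boolean combination, not dtSearch's query engine.

```python
# Hypothetical sample collection: document id -> text.
docs = {
    1: "the court discussed due process of law at length",
    2: "due process of law and equal protection were both argued",
    3: "a brief on civil rights",
    4: "recipes for apple pie",
}

def containing(phrase):
    """Return the ids of documents whose text contains the phrase."""
    return {doc_id for doc_id, text in docs.items() if phrase in text}

# due process of law and not (equal protection or civil rights)
hits = containing("due process of law") - (
    containing("equal protection") | containing("civil rights")
)
print(sorted(hits))
```

Set difference plays the role of "and not", and set union plays the role of "or".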

|
Proximity searching finds a word or phrase within "n" words of another word or phrase: apple pie w/38 peach cobbler.
|
|

|
Directed proximity searching finds a word or phrase within "n" words before another word or phrase: apple pie pre/38 peach cobbler.
|
|
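The difference between the two proximity forms can be sketched in a few lines of Python. This is an illustrative model only: the `positions` and `within` helpers are hypothetical, distance is assumed to be counted between the first words of each phrase, and punctuation is not handled.

```python
def positions(words, phrase):
    """Return the start indexes where the phrase (a list of words) occurs."""
    n = len(phrase)
    return [i for i in range(len(words) - n + 1) if words[i:i + n] == phrase]

def within(text, left, right, n, ordered=False):
    """True if `left` occurs within n words of `right`.

    With ordered=True, `left` must come before `right` (the pre/n case).
    """
    words = text.lower().split()
    left_pos = positions(words, left.lower().split())
    right_pos = positions(words, right.lower().split())
    for i in left_pos:
        for j in right_pos:
            distance = j - i  # positive when `left` comes first
            if ordered and 0 < distance <= n:
                return True
            if not ordered and 0 < abs(distance) <= n:
                return True
    return False

text = "she served peach cobbler and then a warm apple pie"
print(within(text, "apple pie", "peach cobbler", 38))                # w/38 style
print(within(text, "apple pie", "peach cobbler", 38, ordered=True))  # pre/38 style
```

In the sample text the two phrases are close together, so the undirected check succeeds, but "apple pie" comes after "peach cobbler", so the directed check does not.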

|
Phonic searching finds words that sound alike, like Smythe in a search for Smith.
|
|
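The text does not say which phonetic algorithm dtSearch uses. As a stand-in, the sketch below uses the classic American Soundex code, under which Smith and Smythe receive the same key; the `soundex` function and `CODES` table are written here for illustration only.

```python
# Letter groups that share a Soundex digit.
CODES = {}
for letters, digit in (("bfpv", "1"), ("cgjkqsxz", "2"), ("dt", "3"),
                       ("l", "4"), ("mn", "5"), ("r", "6")):
    for letter in letters:
        CODES[letter] = digit

def soundex(word):
    """Return the four-character American Soundex code for a word."""
    letters = [c for c in word.lower() if c.isalpha()]
    if not letters:
        return ""
    code = letters[0].upper()
    previous = CODES.get(letters[0], "")
    for c in letters[1:]:
        digit = CODES.get(c, "")
        if digit and digit != previous:
            code += digit
        if c not in "hw":  # h and w do not separate letters with equal codes
            previous = digit
    return (code + "000")[:4]

print(soundex("Smith"), soundex("Smythe"))
```

Two words count as a phonic match in this model when their codes are equal.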

|
Stemming finds variations on endings, like applies, applied, applying in a search for apply.
|
|
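A toy suffix-stripping stemmer shows the idea. The rules below are hypothetical and deliberately minimal, chosen only to cover the example above; real stemmers use much larger rule sets, and nothing here reflects dtSearch's actual stemming rules.

```python
# Hypothetical suffix rules, tried in order: (suffix, replacement).
SUFFIX_RULES = [("ies", "y"), ("ied", "y"), ("ing", ""), ("ed", ""), ("s", "")]

def stem(word):
    """Reduce a word to a crude stem by rewriting one common suffix."""
    word = word.lower()
    for suffix, replacement in SUFFIX_RULES:
        # Require a few leading letters so very short words are left alone.
        if word.endswith(suffix) and len(word) > len(suffix) + 2:
            return word[:-len(suffix)] + replacement
    return word

for w in ("apply", "applies", "applied", "applying"):
    print(w, "->", stem(w))
```

A search for apply would then match any indexed word whose stem equals the stem of the search term.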

|
Numeric range searching finds any number between two numbers, such as between 6 and 36.
|
|
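A minimal sketch of a numeric range check over plain text follows. It assumes whole numbers only and treats both ends of the range as inclusive; the source does not state either detail, and `numbers_in_range` is a hypothetical helper.

```python
import re

def numbers_in_range(text, low, high):
    """Return the integers in `text` that lie between low and high, inclusive."""
    found = [int(token) for token in re.findall(r"\d+", text)]
    return [n for n in found if low <= n <= high]

print(numbers_in_range("Sizes 4, 12 and 40; lot 36", 6, 36))
```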

|
Macro capabilities make it easy to include frequently used items in a search request.
|
|

|
Wildcard support allows ? to hold a single letter place, and * to hold multiple letter places: apple* and not appl?sauce.
|
|
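Python's standard `fnmatch` module uses the same `?` and `*` conventions, so it can illustrate the request above. The word list is hypothetical.

```python
from fnmatch import fnmatchcase

# Hypothetical list of indexed words.
words = ["apple", "apples", "applesauce", "applet", "apply"]

# apple* and not appl?sauce
hits = [w for w in words
        if fnmatchcase(w, "apple*") and not fnmatchcase(w, "appl?sauce")]
print(hits)
```

Here applesauce matches both patterns and is therefore excluded, while apply never matches apple* in the first place.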

|
Regular expression support provides a way to search for combinations of characters.
|
|

|
Digit character matching enables searching for patterns of numbers.
|
|
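The text does not show dtSearch's own syntax for regular expressions or digit matching. As a generic illustration of searching for a pattern of digits, the sketch below uses Python's `re` module; the pattern and sample text are hypothetical.

```python
import re

# Hypothetical pattern: three digits, a hyphen, then four digits.
pattern = re.compile(r"\b\d{3}-\d{4}\b")

text = "Call 555-0142 or 555-0199; invoice 20231 is unrelated."
print(pattern.findall(text))
```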

|
Unicode support allows for searching of all Unicode-based international languages, including support for "right to left" languages and special options for Asian character handling.
|
|
| |
|
Fuzzy Searching
|
|

|
Fuzzy searching uses a proprietary algorithm to find search terms even if they are misspelled.
|
|

|
Search fuzziness adjusts from 0 to 10 so you can fine-tune fuzziness to the level of OCR or typographical errors in your files.
|
|

|
A search for alphabet with a fuzziness of 1 would find alphaqet; with a fuzziness of 3, it would find both alphaqet and alpkaqet.
|
|

|
Fuzziness is not built into the index, so you can vary fuzziness at the time of each search.
|
|
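Because dtSearch's fuzzy algorithm is proprietary, the sketch below substitutes plain Levenshtein edit distance to show the general idea of a tolerance chosen at search time. The `max_edits` parameter is an assumption and is not the same thing as the 0 to 10 fuzziness scale, whose mapping to character errors is not given here.

```python
def edit_distance(a, b):
    """Levenshtein distance: minimum single-character edits turning a into b."""
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]

def fuzzy_hits(term, words, max_edits):
    """Return the words within max_edits edits of the search term."""
    return [w for w in words if edit_distance(term, w) <= max_edits]

# Hypothetical list of indexed words, including two OCR-style misspellings.
words = ["alphabet", "alphaqet", "alpkaqet", "alphanumeric"]
print(fuzzy_hits("alphabet", words, 1))
print(fuzzy_hits("alphabet", words, 2))
```

The tolerance is an argument to each call rather than a property of the word list, which mirrors the point that fuzziness can vary from one search to the next without reindexing.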
| |
|
Concept / Synonym / Thesaurus Searching
|
|

|
Concept searching lets you look for fast and find quick, speedy, etc.
|
|

|
dtSearch offers variable levels of automatic synonym expansion based on a comprehensive semantic network of the English language.
|
|

|
You can also add your own thesaurus terms.
|
|
|
OCR and Imaging
|
|
|
|

|
dtSearch supports the PDF "image with hidden text" format, and can highlight right on the scanned image in this format.
|
|

|
dtSearch also supports combined text and image displays in HTML.
|
|

|
dtSearch Desktop and Network include a built-in image viewer.
|
|

|
dtSearch recommends using fuzzy searching for sifting through possible OCR errors.
|
|
|
|
|
OCR and PDF
The Adobe PDF file format provides two ways to combine, in a single file, images and OCR'ed text, that is, text obtained from images through Optical Character Recognition (OCR) software.
The "image with hidden text" format stores the complete original image of a scanned document, along with the text obtained through OCR. The text is "hidden" in the sense that simply opening the PDF file displays only the scanned image, not the underlying OCR'ed text. Because the OCR'ed text is still stored in the file, however, dtSearch can index and search it.
Another option for combining scanned images and OCR'ed text in a single PDF file uses "small images" for the parts of each scanned page that do not appear to be text. For example, the format would store a picture or a signature as a small image embedded in the page. The format would store the non-picture portion of the page only as OCR'ed text.
|
|
|
|
|
|