[DeTomaso] any reason to keep old monthly newsletters and Profiles?

Jeff Detrich jjdetrich at gmail.com
Tue Oct 9 13:13:59 EDT 2018


Originally, we were going to use tags, i.e. metadata, to identify what was
in an article. Those tags would then be used to search the articles. Tags
actually work better than general Google-type searches. When you Google,
any article containing the word anywhere gets listed as a match. So if you
were looking for manifold info using Google, almost every article would show
up as a match.
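
To make the difference concrete, here is a minimal Python sketch. The two
sample articles, their text, and their tags are made up for illustration and
are not taken from the newsletters; the function names are hypothetical too.

```python
# Hypothetical sample articles; the text and tags are invented for illustration.
articles = [
    {"title": "Intake manifold swap",
     "text": "Removing the intake manifold and fitting new gaskets.",
     "tags": {"engine", "manifold", "intake"}},
    {"title": "Brake pad comparison",
     "text": "Pedal feel improved; unrelated to the manifold vacuum leak I chased last month.",
     "tags": {"brakes", "brake pads"}},
]

def full_text_search(word):
    """Return titles of articles whose body mentions the word anywhere."""
    return [a["title"] for a in articles if word.lower() in a["text"].lower()]

def tag_search(tag):
    """Return titles of articles that were deliberately tagged with the word."""
    return [a["title"] for a in articles if tag.lower() in a["tags"]]

print(full_text_search("manifold"))  # both articles mention the word
print(tag_search("manifold"))        # only the article tagged "manifold"
```

The full-text search picks up the brake article just because it mentions
the word in passing; the tag search only returns the article someone decided
was actually about manifolds.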

To use tags, you might need 20 or more tags to describe what an article
is about. They could range from the general functional area down to specifics
like the brake pads or motor oil used, and would also include the year and
model of the car. Attached is a doc giving ideas. Getting to the articles would
use a Boolean search (commas to separate the keywords), and then you'd get
back a list of the articles with matches.
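
A rough sketch of how such a comma-separated tag search might work is below.
It assumes the commas mean AND (every listed tag must be present), which is
my assumption here and could just as easily be OR. The article names, years,
and tags are invented for illustration.

```python
from collections import defaultdict

# Hypothetical tagged articles; names, years, and tags are invented for illustration.
TAGGED = {
    "1998-03 Brake upgrade": ["brakes", "brake pads", "pantera", "1972"],
    "2004-07 Oil choices": ["engine", "motor oil", "pantera", "1974"],
    "2011-11 Cooling fixes": ["cooling", "radiator", "pantera", "1972"],
}

def build_index(tagged):
    """Map each lower-cased tag to the set of articles carrying it."""
    index = defaultdict(set)
    for article, tags in tagged.items():
        for tag in tags:
            index[tag.strip().lower()].add(article)
    return index

def search(index, query):
    """Treat commas as AND (an assumption): return articles carrying every listed tag."""
    terms = [t.strip().lower() for t in query.split(",") if t.strip()]
    if not terms:
        return []
    matches = set.intersection(*(index.get(t, set()) for t in terms))
    return sorted(matches)

index = build_index(TAGGED)
print(search(index, "pantera, 1972"))     # the two 1972 articles
print(search(index, "brake pads, 1972"))  # only the brake article
```

The index is built once from the tags, so each search is just a set
intersection and never has to read the article text.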

The problem with this is that someone would have to go through each of the
articles and enter the tags.

Jeff

On Mon, Oct 8, 2018 at 2:38 PM Julian Kift <julian_kift at hotmail.com> wrote:

>    I wrote virtually the same thing to Terry earlier and forgot to copy
>    the list;
>
>    Agreed, but making PDFs fully searchable is not quite as
>    straightforward, or it would have been done. I'm not sure whether
>    all the archives are Word documents converted to PDF or scanned copies
>    of originals. The latter makes it a step harder again, because you need
>    an OCR image search capability.
>
>    The first step would probably be a fully searchable index that at
>    least identifies where the relevant articles are located. I'm
>    sure that part is not rocket science, but either way Terry is your man!
>
>    I just checked and it looks like everything pre-2013 is scanned in;
>    everything later is converted/saved from Word documents.
>
>    Julian
>      __________________________________________________________________
>
>    From: DeTomaso <detomaso-bounces at server.detomasolist.com> on behalf of
>    Himes, Terry (397C) via DeTomaso <detomaso at server.detomasolist.com>
>    Sent: Monday, October 8, 2018 12:32 PM
>    To: Michael Cox; detomaso at server.detomasolist.com
>    Subject: Re: [DeTomaso] any reason to keep old monthly newsletters and
>    Profiles?
>
>    Eeeech.  I hope not. But, yeah, that would make double-trouble. I gotta
>    get my hands on them to see fer sur.
>    "A Purple Heart proves you were smart enough to hatch a plan,
>     stupid enough to try it and lucky enough to survive!"
>    Terry W. Himes
>    JPL Jet Propulsion Laboratory
>    Dawn Spacecraft Team
>    Juno Systems & Software Team
>    TGO Sequence Lead
>    Phone: (818) 393-6261
>    Cell:     (818) 653-8213
>    thimes at jpl.nasa.gov
>    From: Michael Cox <coxmichaelt at gmail.com>
>    Date: Monday, October 8, 2018 at 12:20 PM
>    To: Terry Himes <Terry.Himes at jpl.nasa.gov>,
>    "detomaso at server.detomasolist.com" <detomaso at server.detomasolist.com>
>    Subject: Re: [DeTomaso] any reason to keep old monthly newsletters and
>    Profiles?
>    Terry volunteered:
>    > I probably can do that. I do it every day for Terabytes of telemetry data.
>    >  Right now I am converting 6TB of binary (channelized engineering telemetry) into
>    >  readable and searchable text data.  The 6TB will balloon to over 60TB.
>    >
>    >  If your data is text, or has any metadata, it would be much easier. MySQL would
>    >  be easy, but either ElasticSearch or DynamoDB (nosql) would be better, maybe.
>    >
>    >  Of course, I've offered to help before... only Asa Jay has taken me up on it.
>    >
>    >  Terry
>    Since the docs are scanned I would bet they are images compiled into a
>    PDF. You'd have to crank up your fancy OCR software first.  :^)
>       --michael cox
>
> _______________________________________________
>
>
> Detomaso Email List is not managed by POCA
> Posted emails must not exceed 1.5 Megabytes
> DeTomaso mailing list
> DeTomaso at server.detomasolist.com
> http://server.detomasolist.com/mailman/listinfo/detomaso
>
> To manage your subscription (change email address, unsubscribe, etc.) use
> the links above.
>
> Members who post to this list grant license to the list to forward any
> message posted here to all past, current, or future members of the list.
> They also grant the list owner permission to maintain an archive or approve
> the archiving of list messages.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 6TechAdmin - KeyWordsIssues.jpg
Type: image/jpeg
Size: 197009 bytes
Desc: not available
URL: <http://server.detomasolist.com/pipermail/detomaso/attachments/20181009/6840a75b/attachment.jpg>

