From: Russell Vernon <russellnvernon**At_Symbol_Here**gmail.com>
Subject: [DCHAS-L] Progress report on using AI to read SDS
Date: Mon, 21 Mar 2022 10:25:03 -0700
Reply-To: ACS Division of Chemical Health and Safety <DCHAS-L**At_Symbol_Here**Princeton.EDU>
Message-ID: CAEv1zv141ExTGOwv5Ji_WT2kpqcV_FdVy+sYRkmpMX_zPGVWNA**At_Symbol_Here**mail.gmail.com


Dear Colleagues,
Several weeks ago I asked for scanned-in SDS, and participants of this list responded with documents for our AI team to use in reading scanned versions of SDS.
I was also challenged to report back to the group on the results of our efforts reading SDS...

Well, the team has achieved about 97% accuracy for the specific parts of the SDS they were testing, such as boiling point, flash point, Hazard Statements, etc.
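For readers wondering what an accuracy figure like this can mean in practice, here is a minimal sketch of one possible field-level measure: the fraction of hand-checked fields where the extracted value matches exactly. This is only an assumption for illustration; the function name `field_accuracy`, the field names, and the sample values are all hypothetical and not taken from the team's actual evaluation.

```python
def field_accuracy(extracted, reference):
    """Return the fraction of reference fields whose extracted value matches exactly."""
    if not reference:
        raise ValueError("reference must contain at least one field")
    correct = sum(
        1
        for field, expected in reference.items()
        if extracted.get(field) == expected
    )
    return correct / len(reference)


# Hypothetical values for one SDS: what the model extracted vs. what a chemist verified.
extracted = {
    "boiling_point": "56 C",
    "flash_point": "-20 C",
    "hazard_statements": ("H225", "H319"),  # one statement missed
}
reference = {
    "boiling_point": "56 C",
    "flash_point": "-20 C",
    "hazard_statements": ("H225", "H319", "H336"),
}

score = field_accuracy(extracted, reference)
print(f"Field-level accuracy: {score:.0%}")
```

Exact matching is strict: a hazard-statement list with one missing entry counts as a full miss, so a partial-credit measure would give a different number for the same extraction.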

The machine learning model they started with is NER (Named Entity Recognition). They are now moving on to testing NLP (advanced Natural Language Processing), then plan to try out the NLP and Supervised Learning Model afterward.

With the NER model, we have discovered enough differences between vendor SDS structures that we've had to train on each vendor's documents separately. We hope that will be less problematic with the other models.
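To illustrate why vendor differences force separate handling, here is a rough sketch that uses plain regular expressions as a stand-in for trained models (it is not NER, and not the team's code). The vendor names, label layouts, and the `extract_fields` helper are all hypothetical; the point is only that the same property can be labeled differently from one vendor to the next, so each vendor needs its own rules or its own training set.

```python
import re

NUMBER = r"(-?\d+(?:\.\d+)?)"  # signed integer or decimal

# Hypothetical layouts: each vendor labels the same properties differently.
VENDOR_PATTERNS = {
    "vendor_a": {
        "boiling_point": re.compile(r"Boiling point\s*:\s*" + NUMBER + r"\s*C", re.IGNORECASE),
        "flash_point": re.compile(r"Flash point\s*:\s*" + NUMBER + r"\s*C", re.IGNORECASE),
    },
    "vendor_b": {
        "boiling_point": re.compile(r"BP\s+" + NUMBER + r"\s*C"),
        "flash_point": re.compile(r"FP\s*\(closed cup\)\s*" + NUMBER + r"\s*C"),
    },
}


def extract_fields(text, vendor):
    """Extract numeric fields (in degrees C) using the patterns for one vendor.

    Fields that cannot be found are returned as None so a reviewer can fill them in.
    """
    if vendor not in VENDOR_PATTERNS:
        raise KeyError(f"no patterns defined for vendor {vendor!r}")
    result = {}
    for field, pattern in VENDOR_PATTERNS[vendor].items():
        match = pattern.search(text)
        result[field] = float(match.group(1)) if match else None
    return result


# Made-up text snippets in the two hypothetical layouts.
sample_a = "Boiling point: 56 C\nFlash point: -20 C"
sample_b = "BP 56 C / FP (closed cup) -20 C"

print(extract_fields(sample_a, "vendor_a"))
print(extract_fields(sample_b, "vendor_b"))
# Applying the wrong vendor's rules finds nothing, which is the problem described above.
print(extract_fields(sample_b, "vendor_a"))
```

Returning None for a missing field, rather than guessing, leaves an obvious gap for a human reviewer to catch.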

In summary, it is working but not ready for prime time yet.

Our plan is still to have a chemist review the extracted data and fix any errors as we add data to our chemical inventory library.

Sincerely,
-Russ
--
--- For more information about the DCHAS-L e-mail list, contact the Divisional membership chair at membership**At_Symbol_Here**dchas.org Follow us on Twitter **At_Symbol_Here**acsdchas

The content of this page reflects the personal opinion(s) of the author(s) only, not the American Chemical Society, ILPI, Safety Emporium, or any other party. Use of any information on this page is at the reader's own risk. Unauthorized reproduction of these materials is prohibited. Send questions/comments about the archive to secretary@dchas.org.
The maintenance and hosting of the DCHAS-L archive is provided through the generous support of Safety Emporium.