I am one of those late adopters of new technologies. I was one of the last people I knew to get a cell phone, in roughly 2002 or 2003. Thus, I waited a few months to play with ChatGPT as others discussed its virtues. But as I pondered its potential applications to my research, I thought it might be interesting to ask it to code a network from a criminal indictment and compare it to a human-coded network. From there we could run a Quadratic Assignment Procedure (QAP) analysis to determine how similar the two networks were. I had hypothesized that I would find a high degree of similarity between the ChatGPT and human-coded networks. Maybe a few edges/ties would differ between the network generated by the powerful ChatGPT AI and the one produced by the human coder.
However, my recent experience with ChatGPT (April 17, 2023) left me disturbed. I have read many articles about people using ChatGPT to make simple work of tasks that we previously considered complex. In my research, I often take criminal court documents, such as publicly available indictments, and code them for criminal networks.[i] These are typically operational, or co-offender, networks derived from indictments or criminal complaints in affidavits in support of those indictments. I tend to focus on broad conspiracy cases because those are the places where the most interesting networks are developed by law enforcement investigations. Thus, one of the biggest tasks in my research is the data entry/coding.
The data entry consists of reading the document and determining the connections between individuals (in this case, people accused of crimes) and other individuals in the network, including overt acts in support of continuing criminal enterprises. This is typically done in the form of an edge list. An edge list is effectively two columns of data: pairs of names of people who are connected to each other. My co-authors and I often code the type of edge that we are connecting in a third column. For example, is this a phone conversation, or is it an in-person meeting between two actors committing a crime together? Typically, we code one-mode networks (people to people), but sometimes we will code two-mode networks, such as when criminals attend stash houses or meetings.[ii] In that case, the coded connections run from individuals to meetings, and connections between individuals are inferred from their mutual meeting attendance. Thus, the initial coding of the data is important.
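To make the two formats concrete, here is a minimal sketch in Python of a three-column edge list and of how person-to-person ties can be inferred from a two-mode (person-to-event) list. The actor names, events, and the helper `project_to_one_mode` are hypothetical placeholders invented for illustration; they are not drawn from any indictment, and this is not the coding procedure used in the research cited here.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical one-mode edge list: (source, target, tie type).
one_mode_edges = [
    ("Actor A", "Actor B", "phone call"),
    ("Actor A", "Actor C", "in-person meeting"),
    ("Actor B", "Actor C", "phone call"),
]

# Hypothetical two-mode edge list: (person, event attended).
two_mode_edges = [
    ("Actor A", "Meeting 1"),
    ("Actor B", "Meeting 1"),
    ("Actor C", "Meeting 1"),
    ("Actor C", "Stash house 1"),
    ("Actor D", "Stash house 1"),
]


def project_to_one_mode(person_event_pairs):
    """Infer person-to-person ties from shared attendance at an event."""
    attendees = defaultdict(set)
    for person, event in person_event_pairs:
        attendees[event].add(person)

    ties = set()
    for event, people in attendees.items():
        # Every pair of people at the same event gets one inferred tie.
        for a, b in combinations(sorted(people), 2):
            ties.add((a, b, f"co-attendance: {event}"))
    return sorted(ties)


inferred_ties = project_to_one_mode(two_mode_edges)
for source, target, tie_type in inferred_ties:
    print(source, target, tie_type, sep="\t")
```

Keeping the event name in the third column preserves a record of why each inferred tie exists, which makes it easier to check the tie against the document later.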
I thought ChatGPT might be able to produce an edge list based on publicly available court indictments. I thus decided to try with a recently announced and fully public Sinaloa Cartel indictment from the Southern District of New York.
I began my conversation with ChatGPT by asking it if it knew what an edge list was. It clearly did. It generated a sample edge list formatted in two columns with “Source” and “Target” headings at the top. This is a common format for people who use Gephi, an open-source social network analysis program. It also described graph theory, the mathematical discipline from which much social network analysis is derived. So far so good.
It got a little trickier when I tried to upload an indictment. This was mostly my fault. I decided I needed to update my version of Adobe and make sure that the file was in a readable format, i.e., that it was not an image but readable text in a PDF. I then saved it as a text file and attempted to simply cut and paste the text into ChatGPT, but that was too large. It gave me options for uploading the text of the indictment, and I chose to upload it via Google Drive and share the link.
I had given it instructions on how I wanted the edge list coded. The rules were simple: I wanted operational ties between individuals within the network from the indictment and no hearsay ties. ChatGPT indicated that it understood what I meant. It then took some time to generate an edge list.
When it did generate that edge list, it looked odd. It was indeed two columns of data, but the first column was almost entirely composed of “individual #1,” with other individuals connected to it. I have no way of knowing if this was only a partial edge list, but I expected that this publicly available Sinaloa Cartel indictment, which had been recently released by the Department of Justice, would have more edges within it.
At 65 pages, the indictment was likely to contain more connections between criminal actors. What was generated was a fairly small list, maybe a few dozen ties at most. Worse, the data was effectively useless: the nodes were “individual one,” “individual two,” and so on up to “individual 56.” These were meaningless without the names of the individuals, or “labels” in the parlance of Gephi or UCINET.[iii]
I asked ChatGPT to tell me the names of the individuals it had coded, and it confidently told me the names of the individuals in the edge list. But it looked very strange (I know I am a pesky human). They were not the names I was expecting to see.
In fact, when I went back and I checked ChatGPT’s work, not a single one of the names it reported to me from the criminal indictment was actually in the criminal indictment. This was shocking to me. What did this data mean? I asked ChatGPT where the names had come from, and it told me that it only got information from the criminal indictment I had submitted. I thought that was interesting given that none of the names appeared there.
Thus, I asked a follow-up question explaining to it that I knew none of the names were there and it told me that the names came from its language model. In short, ChatGPT had created an edge list based on names that were completely made-up and not in the underlying document I had submitted.
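The check I did by hand can be sketched in a few lines of Python: confirm that every node label in a generated edge list actually appears in the source text. The sample text, the names, and the helper functions below are hypothetical stand-ins rather than material from the indictment, and this is only a rough first filter. Simple substring matching will miss aliases and spelling variants, and a name appearing in the text says nothing about whether a coded tie is real.

```python
import re


def normalize(text):
    """Lowercase and collapse whitespace so line breaks in extracted text do not hide a match."""
    return re.sub(r"\s+", " ", text).strip().lower()


def labels_missing_from_source(edge_list, source_text):
    """Return every node label in the edge list that never appears in the source text."""
    haystack = normalize(source_text)
    labels = {name for edge in edge_list for name in edge[:2]}
    return sorted(name for name in labels if normalize(name) not in haystack)


# Hypothetical stand-in for the text of a court document.
indictment_text = """JOHN DOE met with RICHARD
ROE to arrange a shipment."""

# Hypothetical machine-generated edge list to be verified.
generated_edges = [
    ("John Doe", "Richard Roe"),
    ("John Doe", "Sam Smith"),
]

missing = labels_missing_from_source(generated_edges, indictment_text)
print(missing)
```

Any label returned by this check is a name the document never mentions, which is a strong signal that the edge list was not derived from the document at all.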
My experience with ChatGPT simply making things up is not unique and has been written about extensively by others. But in my line of research, it is rather important to get things right.
ChatGPT at that point told me that to do what I really wanted it to do, I would have to go through and identify all the names in the indictment myself and identify who I wanted connected. This is called an appendix and has been used in attempts to automate the creation of networks through text analysis. I am largely skeptical of these methods for reasons identified by Kenney and Coulthart (2015) in their excellent analysis of this method and the need to cross-reference it with ethnography.[iv] For example, they found that in this type of research, Osama bin Laden appears very central in networks he has nothing to do with, simply because he gets mentioned a lot. They argue for ethnography to weed these ties out.
At that point I realized it would be simpler to have one of my students code the indictment based on rules I explained to that student than to rely on anything that ChatGPT might generate. I might later take the human-generated node list and give ChatGPT another go.
As a recent John Oliver episode described, this is the problem of the black box with artificial intelligence.[v] I am limited in what I can understand about how ChatGPT generated the edge list I requested. Did it misunderstand what I meant by the word individual? Did it misunderstand what I meant by operational? I am perfectly willing to admit fault in giving bad instructions or asking unclear questions based upon my assumptions about the definitions of words. But as best I can tell, ChatGPT just made up the names out of whole cloth, and the underlying network was likely also a fiction.
Indeed, much of my back and forth with ChatGPT in the conversation stemmed from my own errors about its capabilities and about which format I had actually submitted the data in. What was striking to me was the confidence with which ChatGPT was willing to spit back completely made-up information.
I don’t know what the future of artificial intelligence holds, but as of now it is not ready to assist me with my research. I’m glad that I often work with small, manageable data sets that my students at the undergraduate and graduate levels, my co-authors, and I can read ourselves, synthesize, code, fact check, and verify by hand. It leaves me with great confidence in the underlying data in our research. I wanted to be seduced by the siren song of automation, but thus far I’m not.
[i] Nathan P. Jones et al., “A Mixed Methods Social Network Analysis of a Cross-Border Drug Network: The Fernando Sanchez Organization (FSO),” Trends in Organized Crime 23, no. 2 (June 1, 2020): 154–82, https://doi.org/10.1007/s12117-018-9352-9; Isaac Poritzky, Nathan P. Jones, and John P. Sullivan, “Transnational Cartels and Prison/Jail Gangs: A Social Network Analysis of Mexican Mafia (Eme) and La Familia Michoacana Conspiracy Cases,” Small Wars Journal, October 24, 2022, https://smallwarsjournal.com/index.php/jrnl/art/transnational-cartels-and-prisonjail-gangs-social-network-analysis-mexican-mafia-eme-and; John P. Sullivan, Nathan P. Jones, and Robert J. Bunker, “Third Generation Gangs Strategic Note No. 46: Los Angeles Strike Force Investigation into Alleged Weapons Trafficking Organization Providing Weapons and Ammunition to the Cártel Jalisco Nueva Generación (CJNG),” Small Wars Journal: El Centro, January 31, 2022, 14, https://smallwarsjournal.com/jrnl/art/third-generation-gangs-strategic-note-no-46-los-angeles-strike-force-investigation-alleged.
[ii] Sullivan, Jones, and Bunker, “Third Generation Gangs Strategic Note No. 46: Los Angeles Strike Force Investigation into Alleged Weapons Trafficking Organization Providing Weapons and Ammunition to the Cártel Jalisco Nueva Generación (CJNG).”
[iii] Stephen P. Borgatti, Martin G. Everett, and Linton C. Freeman, “Ucinet for Windows: Software for Social Network Analysis,” 2002, https://www.researchgate.net/publication/216636663_UCINET_for_Windows_Software_for_social_network_analysis.
[iv] Michael Kenney and Stephen Coulthart, “The Methodological Challenges of Extracting Dark Networks: Minimizing False Positives through Ethnography,” in Illuminating Dark Networks: The Study of Clandestine Groups and Organizations, ed. Luke Gerdes (Cambridge: Cambridge University Press, 2015), https://www.researchgate.net/publication/321553872_The_methodological_challenges_of_extracting_dark_networks_Minimizing_false_positives_through_ethnography.