Topics covered in this Part:
Categories of digital objects, including digital data
Eligibility of trade secret protection for digital objects
Management of digital trade secrets
Challenges and risks of digital trade secrets, cyber-attacks, audits
Trade secrets vs. other IP rights for digital objects
With the rapid advancement of digital technologies, businesses face unique challenges in protecting their valuable proprietary knowledge and information. While the “traditional” patent-centric intellectual property protection strategies may still be valid in this technology sector, trade secret protection has emerged as one of the crucial protection regimes for digital technologies.
Since trade secrets potentially cover a wide range of digital data and information, for the purpose of this Part, the term “digital objects” is used. It generally refers to data or information that is stored and transmitted in electronic or digital formats. Consequently, trade secret protection of digital objects may encompass two distinct areas:
digital data (in a text, audio or image format), algorithms or programming code that, as such, constitute valuable trade secret information, and
trade secret information in any technical field, stored in a digital format (such as information about a method of manufacturing substance X stored in a digital file).
In Part VII, we primarily focus on the first category of the “digital objects.” However, the IT security measures addressed in Sections 4 and 6 in this Part may apply to any trade secret information in a digital format (see also Part IV: Trade secret management in general and Section 2.3, in particular).
1. Emergence of digital objects and potential for trade secret protection
Digital objects play a pivotal role in today's business environment. With the advent of cloud storage and computing, electronic communications, advanced data analytics, and large language models like GPT-4, organizations rely heavily on digital platforms. The nature of digital objects presents both advantages and challenges for trade secret protection.
On the one hand, digital formats allow for efficient storage, replication, computation and transmission of information. On the other hand, these very characteristics also increase the risks of unauthorized disclosure, theft or exploitation, based on the ease with which digital data can be copied, shared and disseminated.
To be eligible for trade secret protection, digital objects need to meet the basic requirements for trade secret protection, i.e., they need to be:
commercially valuable because they are secret,
known only to a limited group of persons, and
subject to reasonable steps taken by the rightful holder of the information to keep it secret, including the use of confidentiality agreements for business partners and employees.
To determine the eligibility of digital objects for trade secret protection, it is necessary to analyze the subject matter of the “digital object” and identify whether its secrecy makes that subject matter commercially valuable – before addressing specific access-control and confidentiality schemes.
2. Subcategories of digital objects and eligibility for trade secret protection
Digital objects comprise various elements such as algorithms, code (source and object), text, images, audio and video. When considering trade secret protection, there are very important nuances between various subcategories of digital objects and their commercial value.
Algorithms
Algorithms are the backbone of digital data processing. They are sets of rules or instructions that define a sequence of steps to solve a specific problem or accomplish a particular task. Algorithms play a crucial role in transforming raw data into meaningful information through data analysis, machine learning and artificial intelligence applications. They provide the logic and computational framework that allows digital data to be processed, analyzed and interpreted to derive valuable insights. As such, algorithms are at the very heart of digital economies, enabling data-driven decision-making, predictive modeling and automation.
Code
Code, also known as computer code or programming code, is the language used to write software programs and to execute algorithms. It consists of a series of instructions and commands that direct computers to perform specific tasks or operations. Code serves as the bridge between human intent (source) and machine execution (object), enabling the translation of algorithms into executable programs. Through code, raw or pre-processed data is transformed, changed and presented in ways that deliver desired functionalities and user experiences. Code is the fundamental building block of software applications, systems and platforms that harness and use digital data.
Raw data and processed data
Raw data can be seen as the most fundamental form of data provided as an initial, unprocessed output that is obtained directly from data sources, e.g., a stream of numeric values or readings from a sensor captured over time, or unstructured text documents or messages. Raw data is data in its most granular and detailed form, consisting of individual data points or records, and usually characterized by its lack of structure or organization, and it may require preprocessing, cleaning and formatting before it can be effectively analyzed or used for decision-making purposes. It serves as the foundation for subsequent data processing steps, such as data transformation, aggregation, analysis and visualization.
Raw data, in and of itself, is generally not considered for trade secret protection because it is typically the analysis, insights or processes derived from that data that hold the potential for trade secret protection. However, it is important to note that there can be exceptions and nuances to this general rule. In some cases, raw data may have commercial value if it is unique, is difficult to obtain, is associated with a particular object, location, party or individual, and is collected by proprietary methods.
In contrast to raw data, processed data is data that already has undergone some form of algorithmic or analytical processing to extract meaningful insights or transform it into a more structured and usable format. Processed data is typically more refined, structured and tailored to specific objectives or analysis requirements – and as such of higher commercial value than raw data.
Metadata
Metadata refers to descriptive information that provides context, structure and additional insights about (other) digital data, be it raw data or processed data. It includes attributes such as the creation date, author, file format, size, location and relationships with other data elements. Metadata serves as a form of data about data, facilitating organization, searchability and interoperability of digital information. It helps users understand the content, source and characteristics of data, making it easier to locate, retrieve and analyze specific information. In the context of digital data, metadata plays a vital role in data management, data integration and data governance processes.
Together, algorithms, code and metadata form the foundation for the aggregation, processing and interpretation of (raw and pre-processed) digital data. As the digital landscape continues to evolve, these components will remain essential in utilizing the power of digital data for innovation, problem-solving, automation and decision-making in various domains and industries.
To facilitate understanding of the subcategories of digital objects, we could consider a comprehensive traffic information app as an illustrative example of an end product that encompasses various digital objects.
A traffic information app utilizes a combination of algorithms, code, text, images, audio and video to provide real-time traffic updates and navigation assistance to its users.
Algorithms: the exemplary app employs complex algorithms to analyze data from various sources, such as GPS signals, traffic cameras and user-generated reports, to determine traffic congestion, optimal routes and estimated travel times.
Code (source and object): the app's source code is the underlying programming instructions that dictate how it functions. Compiled into object code, this enables the app to execute seamlessly on users' devices, facilitating smooth interaction and data processing.
Text: through a user-friendly interface, the app displays textual information, such as current traffic conditions, accident reports, road closures and suggested alternative routes. Users can easily read and understand the textual updates and directions provided by the app.
Images: the app integrates images from traffic cameras strategically positioned across roadways, offering users visual insights into real-time traffic conditions. Users can view snapshots or streaming video feeds of congested areas, accidents or construction sites to make informed decisions.
Audio: to enhance user experience and safety, the app provides auditory cues and directions. It may use audio notifications to update users on upcoming turns, lane changes or traffic incidents, allowing drivers to focus on the road while receiving critical information.
Video: incorporating video clips or animations, the app can present dynamic visual representations of traffic flow and road situations. For instance, it could display animated maps illustrating traffic patterns or use video overlays to highlight specific incidents and detours.
By encompassing these elements within its digital framework, the exemplary app epitomizes the term "digital objects," showcasing the diversity and integration of algorithms, code, text, images, audio and video to deliver a comprehensive and interactive traffic information solution to users.
3. “Confidentiality” of digital data, metadata, algorithms and code
To meet a very fundamental prerequisite for trade secret protection, digital objects need to be confidential. This is, in fact, the biggest challenge in practice as organizations share, handle and process various digital objects on a daily basis. Ensuring confidentiality is essential to safeguard sensitive data, algorithms and code from unauthorized exposure or exploitation.
3.1 Raw and processed digital data and metadata
Collection of data
IoT (internet of things), cell phones, payment processing terminals and other electronic communication devices gather hundreds of millions of data points a day. The mere fact of data collection itself is not necessarily a trade secret, but “how” and “what” is collected may lend itself to some type of protection.
For example, an industrial IoT device may use proprietary sensors or other means to collect operating parameters that were not previously obtainable, thus creating a unique data set. If that dataset is encrypted at the time of collection, then the owner could argue that such data is a trade secret because “encryption” would be a reasonable step to maintain secrecy. Similarly, the mere fact that a credit card processor obtains a consumer’s spending behavior at point of sale is not trade secret, but what it collects, and in what format, may be trade secret – especially if it is associated with metadata that is not easily obtainable by the consumer or merchant.
Data collected on behalf of a third party
What about data that is collected on behalf of a third party or data that is not “owned” by the collector or processor? These scenarios raise important questions around asserting trade secret protection. As mentioned above, trade secrets are an intangible asset, thus only the data owner (or exclusive user in certain jurisdictions) can assert the intellectual property right.
Data collected on behalf of a third party usually arises in a consulting or vendor style relationship, where the collector and data generator are in contractual privity. The contract terms typically govern who owns what type of data (e.g., raw or processed) and how the parties should treat such data. Asserting trade secret protection on such data will require a contractual review to determine who is the true owner and associated confidentiality/use restrictions. For example, an industrial IoT company that installs and manages IoT sensors will most likely have a service contract with the data owner that governs, among other things, what type of data are collected, how the data are managed, how the data are processed, and the respective rights to use such data. In this situation, both the IoT company and data owner can likely claim trade secret protection on certain aspects of the data.
Collected data without contractual relationship
Conversely, data collected and processed by a third party without contractual privity may be eligible for trade secret protection, but who can assert such protection is a murky issue. These situations most often arise in the consumer context, where a payment processor, point-of-sale vendor or other party involved in the processing of consumer payments handles consumer data. If a credit card is used, the consumer may be in contractual privity with the card-issuing bank, under terms that dictate the type of data collected and the use rights, but is most likely not in privity with the merchant, point-of-sale vendor or processor (e.g., VISA).
Here, each party may “claim” trade secret protection on certain aspects of the data, but who else within the payment chain also has access to the data, and is there contractual privity that governs confidentiality and use? Without clearly defining these limitations (and establishing clear ownership or exclusive use rights), such data may not fall within the definition of trade secrets. The next section will explore these use rights in more detail.
Exchanging and sharing of data
With the advent of social media, e-commerce, industrial analytics and digital economies, data sharing (e.g., via application programming interfaces (APIs)) and commercial data re-sale have become increasingly vital for commercial applications. The importance of data sharing lies in its ability to foster collaboration and accelerate progress. In an era characterized by vast amounts of data being generated across diverse disciplines, sharing data allows researchers to access a broader pool of information, facilitating cross-disciplinary insights and discoveries. Today, APIs play a crucial role in enabling seamless integration and interoperability between different systems and platforms by providing standardized interfaces for data access and communication.
Furthermore, the rise of commercial data re-sellers and data-sharing agreements has opened up new opportunities for everyone to access valuable datasets that were previously inaccessible or too time-consuming to collect. These re-sellers curate and aggregate data from multiple sources, applying advanced analytics and quality control measures, making it readily available for scientific endeavors. While challenges related to data privacy, ethics and data quality persist, the increasing importance of digital data sharing, APIs and commercial data re-sellers signifies a paradigm shift towards collaborative and data-driven research, ultimately enhancing scientific knowledge and innovation across domains.
Shared data with limited access
This new collaborative dimension of data can make it difficult to determine whether specific digital data is a trade secret – who has access to it and which reasonable steps have been taken by the rightful holder of the information to keep it secret – and if so, how to manage and exploit the trade secret.
In practice, confidentiality for shared data (raw, processed and metadata) is often established, where applicable,
There is most often a contractual access control as well. The purpose of limited access is to ensure that only authorized individuals can view or use the shared data, while still enabling collaboration and information exchange within a defined and trusted network of users. When trying to assert trade secret status for shared data, the data proprietors typically struggle to assert the secrecy and the level of granularity regarding access control for shared data with limited access.
Having said that, it can be difficult to establish precise borders between shared data with limited access and trade secrets, as the distinction lies in the purpose, scope and level of confidentiality associated with each concept in each individual case. As a rule of thumb, and in contrast to shared data with limited access, data with trade secret status is typically not shared with or disclosed to anyone outside the organization that owns it and is subject to very strict organization-internal access-control schemes, as will be elaborated below with regard to code and algorithms.
To illustrate how important it can be to differentiate between trade secret protection and protection as shared data with limited access, one can consider the following hypothetical example around a proprietary algorithm for optimizing energy consumption in industrial manufacturing processes.
Organization A offers and sells energy optimization software. The heart of Organization A’s energy optimization software is its proprietary algorithm, which provides a significant competitive advantage in the market. This algorithm is a trade secret that embodies years of research, development and testing.
Organization A takes extensive measures to keep this algorithm confidential and secure. The company restricts access to the algorithm to a small team of trusted software engineers who have signed strict non-disclosure agreements. The algorithm's source code is stored on secure servers with advanced encryption, and access is granted only on a need-to-know basis. No external parties, including collaborators or partners, have access to the complete algorithm, ensuring that Organization A maintains exclusive control over this valuable trade secret.
Now, Organization A decides to collaborate with a research institution, Organization B, to advance the field of energy-efficient manufacturing. As part of the collaboration, Organization A shares certain data related to manufacturing processes and energy consumption trends with Organization B. However, to protect their intangible assets, Organization A sets up an agreement that limits Organization B’s access to specific subsets of data. This limited access allows Organization B’s researchers to analyze the data for research purposes only, ensuring that the core proprietary algorithm and implementation details remain confidential and inaccessible. Organization A retains control over the key trade secret – the precise algorithm – which is never shared with Organization B or any other party.
In this example, the shared data with limited access represents a controlled sharing of non-core information with a collaborator, while the trade secret encompasses the core proprietary algorithm, which is kept strictly confidential within the company. Organization A strategically manages the boundaries between shared data and the trade secret to balance collaboration with protection of its most valuable intellectual property.
3.2 Code and algorithms
As mentioned above, code is the language used to write software programs, contains the implementation details of algorithms and can reveal crucial business information about how data is processed and utilized. Unless an open-source strategy is pursued, protecting the confidentiality of code and algorithms is paramount to prevent unauthorized individuals from understanding or reverse-engineering proprietary software in order to build and defend competitive edges over competitors. In practice, techniques such as code obfuscation, encryption, and strict access controls are applied to maintain the confidentiality of code (and the algorithms behind it) and to prevent unauthorized access or copying.
There are some industry-specific implications, but it is generally far less common to share code and/or algorithms between businesses than, for example, sets of processed data. This indicates and emphasizes the commercial value attributed to, and the level of secrecy applied to, code and algorithms and opens a primary playing field for digital data trade secrets.
Copyright is another form of intellectual property protection available to code and algorithms. However, it should be noted that certain jurisdictions do not permit an owner to assert both trade secret and copyright, especially if the copyrighted software discloses a majority of the source code or the “proprietary” portions.
4. Management of digital trade secrets
We have seen that digital objects may be protected by trade secrets (i.e., digital trade secrets) as long as they meet the eligibility criteria for such protection. The subsequent question is how the holders of digital trade secrets can properly manage them so that they can prove, in administrative and/or judicial proceedings, that the eligibility for trade secret protection has been met in an individual case.
The proper management of digital trade secrets involves defining and categorizing the information to establish its protected status, outlining the necessary measures to continuously safeguard its confidentiality, and developing a trade secret management lifecycle around each trade secret or each kind of trade secret. This section is tailored to specific challenges and opportunities surrounding effective management of digital trade secret assets. However, the descriptions relating to technology measures against disclosure and unauthorized use of trade secrets are also applicable to non-digital trade secret information in a digital format.
4.1 Identifying and selecting digital trade secrets
Businesses need to first identify and select specific digital information that qualifies as one or more trade secrets, and clarify which digital object or collection of digital objects defines each respective trade secret.
Capturing digital trade secret information
Assuming that the digital objects are tagged and can be specifically identified, the first step of capturing potential digital trade secret information is rather straightforward. It requires the creation, transfer or copying of the relevant digital objects into a dedicated file management system or structure to clearly separate them from non-relevant digital objects (e.g., into a trade secret management system on-premises, in a commercial or corporate cloud storage, encrypted IoT device, to specially designated and locked-away hard drives, or the like).
This step can be executed for an individual object or for multiple objects at the same time. Batch captures of multiple objects that require the same set of permissions ease the work of the person capturing the information, as the metadata of the capturing process is shared across all objects recorded in one session and does not have to be provided (often manually) for each digital object. Under this approach, intake of potential digital trade secrets in a collection can be an automated process while the trade secret status designation is handled by specially trained personnel (e.g., Chief Trade Secret Officers, or trade secret professionals within the IP department) with their own stack of resources, or with full automation via an API gateway or other file transfer mechanism to a secure storage destination.
It should be noted that the initial capturing step can be done completely anonymously, or it can be combined with the capture by the trade secret creator. The latter may facilitate identifying employees to be rewarded under employee remuneration or incentivization programs.
Designating digital trade secrets
Once the potential trade secret information is captured, it is vital to designate digital trade secrets, taking into account the value of the trade secrets and their risks. By clearly defining their scope, businesses can better understand the level of protection required and the strictness of access control.
Example 1: Potential trade secret information is captured by an employee
Employee A uploads Documents 1, 2, 3 and 4 – all related to a specific financial algorithm, e.g., its design document, its user manuals and its source code.
Decision Maker B reviews the uploaded documents and (e.g., in cooperation with Employee A) designates Documents 1 and 3 as trade secrets that require corresponding access controls.
Decision Maker B electronically informs Employee A about the trade secret status of Documents 1 and 3, and the non-trade secret status of Documents 2 and 4.
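The capture and designation steps in Example 1 can be sketched as a small record-keeping routine. This is a minimal illustration only: the `Status` values, the `CapturedObject` fields and the function names are hypothetical and do not describe any particular trade secret management system.

```python
from dataclasses import dataclass
from enum import Enum


class Status(Enum):
    PENDING = "pending review"
    TRADE_SECRET = "trade secret"
    NOT_TRADE_SECRET = "not a trade secret"


@dataclass
class CapturedObject:
    name: str
    captured_by: str                  # capture metadata shared by the whole batch
    status: Status = Status.PENDING   # nothing is designated at capture time


def capture_batch(names, captured_by):
    """Capture several objects in one session; the metadata is entered once."""
    return {name: CapturedObject(name, captured_by) for name in names}


def designate(batch, trade_secret_names):
    """Record the reviewer's decision and return a notice for the uploader."""
    for name, obj in batch.items():
        obj.status = (Status.TRADE_SECRET if name in trade_secret_names
                      else Status.NOT_TRADE_SECRET)
    return {
        status.value: sorted(n for n, o in batch.items() if o.status is status)
        for status in (Status.TRADE_SECRET, Status.NOT_TRADE_SECRET)
    }


# Employee A uploads four documents; Decision Maker B designates two of them.
batch = capture_batch(
    ["Document 1", "Document 2", "Document 3", "Document 4"], "Employee A"
)
notice = designate(batch, {"Document 1", "Document 3"})
print(notice)
```

Keeping capture and designation as separate steps mirrors the text: intake can be automated, while the status decision stays with a reviewer.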
Example 2: Potential trade secret information is captured by an IoT device
IoT device collects raw process equipment data – all related to a proprietary process.
IoT device uses an internal algorithm to compile the data and segregate the data that is useful for further downstream processing.
IoT device encrypts this segregated data, which is then securely communicated on a routine basis to a separate on-premises server or cloud storage for further use by data scientists. The data scientists are informed that the data in such location are considered trade secrets.
In general, in selecting information to be protected by trade secrets, an over-inclusive approach is preferred compared to a too restrictive selection, since the latter might entail the risk of losing control of information that may later turn out to be critical. However, excessively inflationary trade secret designation for any collected information without further review of trade secret eligibility should be avoided in order to: (i) keep the amount of digital trade secret information manageable and well-structured; and (ii) be able to prove that the digital trade secrets are managed in a finely differentiated categorization system and are not simply in a “file dump.” Otherwise, a court may not agree with the owner’s trade secret claim.
It is important to emphasize at this point that digital information which does not achieve trade secret status in this intermediate step can nevertheless be valuable as contextual information in transactions, since it can facilitate the implementation of trade secrets, e.g., as know-how, and be protected via commercial agreements.
4.2 Timestamping
One of the key advantages of digital trade secrets is the ability to timestamp them. Timestamping the contents of documents provides a way to establish the existence, the integrity and the possession of the contents at a specific point in time. Typically, timestamping involves a trusted third party or a centralized timestamping authority who assigns a unique timestamp to the document, which is then digitally signed by the authority, creating a verifiable proof of the document's existence at that time. To be precise, the utility of the timestamping is not limited to digital trade secrets but also extends to non-digital trade secrets in a digital format (e.g., a manufacturing process of chemical compound X described in a digital file).
For this purpose, some national or regional intellectual property offices have established services that provide a date- and time-stamped digital fingerprint of any file.
Blockchain
Blockchain technology can offer a decentralized and tamper-resistant alternative to timestamps from a centralized authority or service.
The decentralized nature of blockchain ensures that no single entity has control over the timestamps, making it difficult for anyone to manipulate the data. Additionally, the immutability of blockchain ensures that once a document is timestamped, it cannot be altered or removed without detection.
Blockchain-based timestamping also offers transparency, as the timestamped information becomes part of a public ledger (without disclosing the confidential information itself, see Section 4.3, below) that can be independently verified by anyone. This provides a high level of trust and accountability, as the integrity and authenticity of the document can be verified by multiple participants in the blockchain network. Moreover, blockchain-based timestamping systems often come with built-in mechanisms, such as consensus algorithms, to ensure the accuracy and consistency of the timestamps.
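The tamper-evidence property described above can be illustrated with a simple hash chain, in which each record commits to the hash of the previous one. This is a simplified, single-party sketch: it has no network, no consensus mechanism and no decentralization, so it is not a blockchain, and the record fields and timestamps are assumptions chosen for illustration.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first record


def entry_hash(body):
    """SHA-256 over a canonical JSON encoding of a record's contents."""
    payload = json.dumps(body, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def append(chain, document_hash, timestamp):
    """Append a record that commits to the hash of the previous record."""
    prev = chain[-1]["hash"] if chain else GENESIS
    body = {"document_hash": document_hash, "timestamp": timestamp, "prev": prev}
    chain.append({**body, "hash": entry_hash(body)})


def is_intact(chain):
    """Recompute every link; an edited record no longer matches its hash."""
    prev = GENESIS
    for record in chain:
        body = {k: record[k] for k in ("document_hash", "timestamp", "prev")}
        if record["prev"] != prev or record["hash"] != entry_hash(body):
            return False
        prev = record["hash"]
    return True


ledger = []
append(ledger, hashlib.sha256(b"design document v1").hexdigest(), "2024-01-15T10:00:00Z")
append(ledger, hashlib.sha256(b"source code v1").hexdigest(), "2024-01-16T09:30:00Z")
print(is_intact(ledger))   # True

ledger[0]["timestamp"] = "2023-01-15T10:00:00Z"   # attempt to backdate a record
print(is_intact(ledger))   # False
```

Only document hashes appear in the records, so the chain can be inspected without revealing the underlying documents.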
4.3 Measures against disclosure and unauthorized access
Capturing trade secret information in digital systems (be it trade secret management systems, cloud or on-premise data storage or timestamping services) can be the source of various security risks which can: (i) destroy the trade secret status of the captured information as a whole; or (ii) provide opportunities for trade secret misappropriation. Accordingly, the protection measures need to be addressed specifically with regard to digital trade secrets. To be precise, the various digital protection measures addressed in this section are also applicable to digital representations of any trade secrets.
Measures against the disclosure of digital trade secrets
One imminent risk when it comes to capturing trade secrets on digital systems (centralized or decentralized) is the risk of disclosing a trade secret accidentally to unauthorized persons or even to the public.
Many traditional trade secret management systems are intentionally run on computers without internet connection (e.g., lab computers) or reside on-premises on local servers of corporate customers with very strict access control even on the hardware side (e.g., no USB devices allowed, no other connectivity enabling data transfers like Bluetooth) and extensive security logs. However, current management systems are “always connected,” whether via cloud storage, a personal communication device (e.g., a cell phone) or an IoT device. Accordingly, digital trade secret capturing systems generally face extensive security due diligence if they involve (especially external) cloud storage, communication links and/or blockchain integration, ranging from implementing technical safeguards to obtaining relevant standard certifications.
Two pillars of digital trade secret security are hashing and encryption. Both are cryptographic techniques used to protect data, but they serve different purposes and have distinct characteristics.
Hashing is a one-way process that converts data of any size into a fixed-length string of characters, known as a hash value or checksum. For example, documents can be hashed with an algorithm from the SHA (Secure Hash Algorithm) family to ensure data integrity and verify the authenticity of files. A SHA checksum is a fixed-length alphanumeric string that, for practical purposes, uniquely represents the contents of a document. Any slight change in the file results in a completely different checksum, which makes it virtually impossible to tamper with the document without altering the checksum. By comparing the newly computed SHA checksum of a document with the original checksum, one can quickly determine whether the file has been modified or corrupted.
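As a brief illustration of this behavior, the following sketch uses SHA-256 from Python's standard `hashlib` module. The sample contents are invented for the example, and the choice of SHA-256 rather than another member of the SHA family is an assumption.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 checksum of the data as a hexadecimal string."""
    return hashlib.sha256(data).hexdigest()


original = b"Process parameters: 180 C, 45 min"
altered = b"Process parameters: 181 C, 45 min"   # a single character changed

checksum_original = sha256_hex(original)
checksum_altered = sha256_hex(altered)

print(len(checksum_original))                    # 64 characters, whatever the input size
print(checksum_original == sha256_hex(original)) # True: same content, same checksum
print(checksum_original == checksum_altered)     # False: the change is detected
```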
In the above example, in which trade secrets are timestamped using blockchain technology, the document itself is neither disclosed to the ledger nor stored on a digital file storage system (like the Interplanetary File System (IPFS)). Rather, the document hash (representing either an individual file or a collection of digital files, such as a .zip file) is permanently recorded on the ledger together with the relevant timestamp, effectively avoiding public disclosure of the confidential information while leveraging the benefits of the transparency of a public blockchain. As such, it is not possible to recreate the hashed document based on its checksum. Conversely, however, it is possible to provide evidence that a document with a matching hash was in the possession of the person (or blockchain wallet) to whom the timestamped hash can be attributed at the time when the timestamping occurred.
By hashing documents, information can be stored and hashed offline or on-premises while only the hash is recorded and timestamped online. Naturally, document retrieval from the digital system itself is not possible in these instances as only the hash/checksum is disclosed to the digital system.
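The separation between what stays on-premises and what is recorded online might look as follows. This is a sketch under stated assumptions: the file names and contents are invented, the record layout is hypothetical, and the timestamp is supplied by the caller here, whereas in practice it would be assigned by the timestamping authority or ledger.

```python
import hashlib
import json


def collection_hash(files):
    """Hash a set of files (name -> bytes) into a single digest."""
    manifest = {name: hashlib.sha256(data).hexdigest() for name, data in files.items()}
    # Sorting the keys makes the digest independent of the order of the files.
    canonical = json.dumps(manifest, sort_keys=True).encode("utf-8")
    return hashlib.sha256(canonical).hexdigest()


def make_record(files, timestamp):
    """The only data that leaves the premises: a digest and a timestamp."""
    return {"hash": collection_hash(files), "timestamp": timestamp}


def matches(record, files):
    """Later evidence step: do these files reproduce the recorded digest?"""
    return record["hash"] == collection_hash(files)


files = {
    "design.txt": b"Routing heuristic, draft 3",
    "source.py": b"def route(): ...",
}
record = make_record(files, "2024-01-15T10:00:00Z")

print(sorted(record))            # ['hash', 'timestamp']: no file contents
print(matches(record, files))    # True

modified = dict(files, **{"design.txt": b"Routing heuristic, draft 4"})
print(matches(record, modified)) # False
```

Because the record holds only a digest, the holder must keep the original files unchanged in order to reproduce the match later.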
Encryption, on the other hand, is a two-way process that converts data into ciphertext using an encryption algorithm and a secret key. The primary purpose of encryption is data confidentiality: it ensures that data remains secure and unreadable to unauthorized individuals. The original data is transformed into an encrypted form, which can be decrypted back into its original form using the corresponding decryption algorithm and the correct key. Modern wireless communication devices, for example, rely on encryption of this kind when sending and receiving data.
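The two-way nature of encryption, in contrast to one-way hashing, can be illustrated with a deliberately simple toy cipher. The sketch below uses a one-time pad (XOR with a random key as long as the message) purely because it fits in a few lines of standard-library Python; it is an assumption made for illustration and not a recommendation. In practice, organizations would rely on vetted algorithms such as AES through maintained cryptographic libraries.

```python
import secrets


def xor_bytes(data: bytes, key: bytes) -> bytes:
    """XOR data with a key of equal length (one-time pad, illustration only)."""
    if len(key) != len(data):
        raise ValueError("key must be exactly as long as the data")
    return bytes(d ^ k for d, k in zip(data, key))


# Hypothetical confidential content.
plaintext = b"confidential formula"

# The secret key is random, as long as the message, and must never be reused.
key = secrets.token_bytes(len(plaintext))

ciphertext = xor_bytes(plaintext, key)   # encryption
recovered = xor_bytes(ciphertext, key)   # decryption with the same key

print(recovered == plaintext)
```

Unlike a checksum, the ciphertext can be turned back into the original data, but only by someone who holds the correct key.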
In practice, timestamping, hashing and encryption can be combined, depending on the level and nature of individual confidentiality requirements that the trade secret holder wants to introduce.
Access-control measures
In addition, access-control measures should be put in place to prevent unauthorized access, disclosure or theft of the trade secret information from digital systems. A minimum standard for such access controls is 2FA (Two-Factor Authentication), a security measure designed to add an extra layer of protection to user accounts and systems by requiring users to provide at least two forms of identification or credentials during the authentication process.
2FA, as the name suggests, utilizes two factors for authentication. Typically, these factors include something the user knows (such as a password or PIN) and something the user possesses (such as a mobile device or security token). When logging in, users enter their password as the first factor and then provide the second factor, often a temporary code generated by an authenticator app or received via SMS.
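The temporary codes produced by authenticator apps are commonly time-based one-time passwords (TOTP, standardized in RFC 6238). The following sketch shows how such a code can be derived from a shared secret and the current time. It assumes the common defaults of HMAC-SHA-1, 30-second time steps and six digits; the function names are hypothetical, and the secret shown is the published RFC test value, not something to use in practice.

```python
import hashlib
import hmac
import struct
import time
from typing import Optional


def totp(secret: bytes, at_time: Optional[float] = None,
         step: int = 30, digits: int = 6) -> str:
    """Derive a time-based one-time code from a shared secret (RFC 6238)."""
    if at_time is None:
        at_time = time.time()
    counter = int(at_time // step)                      # number of elapsed time steps
    mac = hmac.new(secret, struct.pack(">Q", counter), hashlib.sha1).digest()
    offset = mac[-1] & 0x0F                             # dynamic truncation (RFC 4226)
    value = struct.unpack(">I", mac[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(value % 10 ** digits).zfill(digits)


def second_factor_ok(secret: bytes, submitted: str,
                     at_time: Optional[float] = None) -> bool:
    """Compare the submitted code with the expected one in constant time."""
    return hmac.compare_digest(totp(secret, at_time), submitted)


# Test secret from the RFC 6238 appendix; real secrets are random and per-user.
shared_secret = b"12345678901234567890"
print(totp(shared_secret, at_time=59))  # prints 287082
```

Because the server and the user's device both hold the secret and know the time, they compute the same code independently, and a stolen password alone is not enough to log in.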
Depending on the required level of protection, MFA (Multi-Factor Authentication) can expand on the concept of 2FA by incorporating additional factors beyond the two mentioned above. These additional factors can include something the user is (biometric data, such as fingerprints or facial recognition) or a further possession factor (such as a physical smart card or a registered device). By combining multiple factors, MFA provides an even higher level of security and reduces the risk of unauthorized access.
Another option for access controls and security is the institution of a secure enclave, which segregates a portion of a database or memory and subjects it to enhanced security controls. This can be done on most storage devices and databases (e.g., on laptops, servers and mobile devices).
4.4 Interoperability
One additional point that should be mentioned in the context of capturing and designating digital trade secrets is the potential requirement for interoperability. Interoperability refers to the ability of different systems or technologies to seamlessly work together and exchange information.
In the context of a trade secret capturing solution, interoperability is essential for ensuring that the solution can accommodate future transactions, even if those transactions were not initially anticipated. The capturing solution designed with interoperability in mind is capable of integrating and interacting with other systems, platforms or protocols in the future, if necessary. This foresight allows the solution to support various transactions, such as transfers (sale or intra-company transfers after M&A deals), exchanges or smart contract interactions, regardless of the specific ecosystem or technology they operate on.
A lack of interoperability can result in several disadvantages, including increased transaction costs, as businesses may need to employ multiple platforms to handle different aspects of their trade secret operations. In addition, difficulties can arise in validating or proving timestamps that are not synchronized across systems, which may make it more challenging to maintain accurate and consistent trade secret capturing records.
5. Digital trade secrets and large language models
The emergence of large language models, such as GPT-4 (Generative Pre-trained Transformer 4), has revolutionized natural language processing and generated new opportunities and challenges regarding trade secret protection. These advanced models have the capability to analyze and generate human-like text, making them valuable tools for various applications, including content generation, customer service automation and data analysis. However, businesses must navigate the delicate balance between leveraging the advantages of large language models internally while protecting their trade secrets from unauthorized disclosure. One recent case that made the press in this context involved Samsung employees allegedly leaking confidential data, such as the source code itself for a new program, internal meeting notes and data relating to hardware whilst using ChatGPT to help them with tasks.
To protect trade secrets while utilizing large language models (LLMs) internally, businesses should consider adopting various strategies:
Focus on safeguarding the specific inputs or proprietary data. By keeping confidential information within their control and limiting access to authorized individuals, businesses can lower the risk of exposing sensitive trade secrets to the LLM. This, as outlined above, may involve implementing access controls, encryption and mechanisms to monitor access to the large language model.
Adopt techniques such as data masking or data obfuscation to prevent the direct exposure of proprietary information to the model. By modifying or anonymizing certain aspects of the data before inputting it into the model, businesses can maintain the confidentiality of trade secrets while still benefiting from the model's language processing capabilities. Careful consideration should be given to the selection and treatment of data to strike a balance between utility and confidentiality.
Establish clear policies and agreements with employees and contractors involved in utilizing large language models. Non-disclosure agreements (NDAs) and confidentiality clauses can outline the responsibilities and obligations of individuals to ensure the protection of trade secrets. Employees should be trained on the importance of maintaining confidentiality, data security best practices and the risks associated with unauthorized disclosure.
Consider a private instance of the LLM (albeit for a fee), where there are contractual safeguards with the LLM vendor regarding use and destruction of proprietary data that is uploaded to the model. Note that OpenAI has such a product that allows individuals or corporations to use GPT-4 on a private basis through APIs.
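The data masking strategy listed above can be sketched as a small pre-processing step that is applied to a prompt before it leaves the organization. This is only an illustration under stated assumptions: the function name, the placeholder format, the list of sensitive terms and the sample prompt are all hypothetical, and a simple pattern match will not catch every form of confidential information.

```python
import re

# Simple pattern for e-mail addresses; real deployments would cover more types.
EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def mask_prompt(text: str, secret_terms: list) -> str:
    """Replace e-mail addresses and listed sensitive terms with placeholders."""
    masked = EMAIL.sub("[EMAIL]", text)
    for i, term in enumerate(secret_terms, start=1):
        # re.escape treats the term literally; matching ignores letter case.
        masked = re.sub(re.escape(term), f"[TERM_{i}]", masked, flags=re.IGNORECASE)
    return masked


# Hypothetical prompt and hypothetical internal code names.
prompt = "Summarize the Project Falcon notes sent by j.doe@example.com about alloy ZX-9."
safe_prompt = mask_prompt(prompt, ["Project Falcon", "ZX-9"])
print(safe_prompt)
```

The numbered placeholders allow the model's answer to be mapped back to the original terms internally, which helps preserve utility while the terms themselves are never sent to the model.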
In conclusion, as large language models like GPT-4 become more prevalent, businesses need to strike a balance between leveraging their capabilities and protecting trade secrets. By doing so, businesses can continue to innovate, improve operational efficiency and maintain their competitive edge in an era of advanced language processing technologies.
6. Challenges and risks in protecting digital trade secrets and mitigation strategies
Digital trade secrets are exposed to a range of specific security challenges and risks. This section addresses the most common of them and, at a high level, how they can be mitigated. It may also be relevant to non-digital trade secret information stored in a digital format.
The general operational and contractual mitigation measures against trade secret leakage and misappropriation were explained in Part IV: Trade secret management. As illustrated in Part IV, the importance of setting up institutional decision-making structures, document management, logistical measures, education and training of employees, IT measures and contracts with employees and external partners is also applicable to the protection of digital trade secrets.
In essence, once digital trade secrets are leaked or misappropriated, there is a high risk that they cannot be fully recovered. Therefore, preventing disclosure of and unauthorized access to digital trade secrets in the first place should be the priority of any trade secret holder.
6.1 Vulnerability to theft, cyber-attacks and data breaches
In the digital age, protecting digital trade secrets presents challenges and risks, particularly in terms of vulnerability to theft, cyber-attacks and data breaches. These threats pose a considerable risk to the confidentiality and integrity of valuable proprietary information, potentially leading to severe financial and reputational consequences for businesses.
One of the primary challenges in protecting digital trade secrets is the heightened vulnerability to theft, since they can be easily copied, shared and disseminated. Cyber-attacks by sophisticated hackers and cybercriminals can pose another significant risk to the protection of digital trade secrets. In addition, data breaches (exposure of sensitive information by unauthorized individuals or hackers who gained access to a company's digital infrastructure) can result in a loss of secrecy and of the entire value of the trade secrets.
Robust security measures, including regular updates and incident response plans, can be implemented to mitigate the risks. At the same time, security measures should also be kept at a reasonable level, as with any other measures for trade secret protection. The value of the trade secret relative to the cost of protecting it, as well as the characteristics of the organization, may also need to be taken into account (see Part IV, Section 2.3).
As already highlighted in Part IV, digital trade secrets are also susceptible to a high risk of disclosure or misappropriation by current and former employees or by external collaborators and business partners with whom trade secret information is shared. In the digital technology and digital service sectors, global employee mobility, outsourcing arrangements and the use of offshore resources are part of daily business in many organizations, which heightens the risk.
To mitigate the risk, the implementation of access controls on a need-to-know basis, robust contractual measures, education and training, and exit and inbound interviews are important not only for prevention of misappropriation but also for avoiding contamination with trade secrets held by others (see Part IV, Sections 3.1 and 5.1).
6.2 Exposure during audits
Internal and external audits play a crucial role in ensuring compliance, identifying operational efficiencies and assessing financial performance. However, the process of conducting audits also requires auditors, even when bound by non-disclosure agreements, to have access to confidential information of the businesses to evaluate their financial statements, internal controls and compliance with regulations. This sharing of information raises the risk of trade secret misappropriation or accidental disclosure to unauthorized individuals.
To mitigate the risk of digital trade secret exposure during audits, businesses should establish robust confidentiality agreements with auditors, explicitly outlining the scope of information they are authorized to access and defining their obligations regarding trade secret protection. This agreement should also include provisions for the return or destruction of any trade secret information obtained during the audit process. Additionally, implementing technological safeguards, such as data encryption, access controls and data trails, can further protect digital trade secrets during audits.
6.3 Retrieving and regaining control of digital trade secret data
Due to their digital availability, once digital trade secrets have been misappropriated or used without authorization, retrieving and regaining control over that information becomes a daunting task.
Organizations can take certain measures to address this issue and attempt to mitigate the potential damage. Besides the “traditional” approach to recover digital trade secrets through legal recourse, businesses may consider leveraging technology and digital forensics to track and retrieve digital trade secrets. This might necessitate working with specialized cybersecurity firms or forensic experts to trace the unauthorized use of trade secrets, identify the locations or systems involved and attempt to regain control over the information. The process may involve employing advanced data analysis techniques, monitoring networks or utilizing forensic tools to trace the movement and storage of the trade secret data.
The success of these efforts largely depends on the sophistication of the unauthorized user, the extent of their activities, and the availability of digital evidence.
Legal and technological measures to retrieve and regain control over the trade secret information may not always guarantee a full recovery. Therefore, it can only be reiterated that preventing disclosure and unauthorized use of digital trade secrets through robust security and contractual measures, employee training and similar measures remains crucial.
7. Trade secrets vs. other intellectual property rights for digital objects
7.1 Digital objects: trade secrets vs. patents
As explained in Part III: Basics of trade secret protection, most corporations use a strategy of combining patent protection and trade secret protection, considering the advantages and disadvantages of each protection mechanism. In this section, we briefly look into certain aspects that are particularly relevant to digital objects.
Patentability of digital objects
In general, according to patent laws of many countries, data as such, software code as such and mere presentation of information as such, are not considered as inventions that are eligible for patent protection. Similarly, abstract ideas, mathematical methods as such, as well as business or commercial methods as such, are not patent-eligible subject matter in many countries. However, it is not always easy for innovators to draw a clear line between these subjects excluded from patent protection and patentable software- or computer-implemented inventions.
In addition, national patent laws and practices differ in how they apply the patent eligibility and patentability criteria to digital objects. Even where these differences are seemingly minor, patent applicants may need to tailor their applications to meet specific national requirements, which adds complexity and increases the risk of rejection.
Such uncertainty about the availability of patent protection for these inventions makes it more difficult for innovators in the digital technology and service sectors to decide whether they should protect their creations under the patent system or the trade secret system.
Challenges in software-implemented patent litigation
Evidence of use
With respect to software-implemented inventions, most patent owners do not have direct evidence of infringement at the time of filing a lawsuit, because such direct evidence requires access to the source code of the alleged infringer. Rather, they claim in good faith that a defendant may be infringing the patent, because the outward functionality of the allegedly infringing program is similar to the claimed invention. Obtaining direct evidence is not easy. In some countries, robust discovery procedures permit patent owners to review source code during litigation. Because most defendants consider the source code a trade secret, third-party “escrow” agents are used for the review of the alleged infringer's source code. The third party holds a static copy of the code and manages the access of the patent owner and/or its experts (usually on-site, to restrict printing or copying).
Territoriality
As patents are strictly territorial intellectual property rights, if the alleged infringement of certain elements (but not all) of the claimed invention was geographically distributed among different countries, it can give rise to an array of questions during patent litigation, for example:
if the data is fragmented and stored in multiple locations (in cloud or on-premises)
if infringement takes place over multiple servers spanning more than one country
if data is exported to a country with no patent protection options for data “processing,” and the processing results are subsequently re-imported for commercial exploitation
if unauthorized personnel are relocating data (such as on a USB drive) into countries where no patent protection is available or was sought.
These are just some of the challenges when enforcing patents on digital objects. Unfortunately, there is no single answer to the above questions. If trade secret protection is pursued (potentially in a mixed strategy together with patents), robust security measures (e.g., 2FA, encryption, breach detection) are currently the best way to minimize the risks. If any misappropriation or infringement is identified, one should act swiftly, especially if it was identified in a territory with a robust rule of law.
Equitable remedies against patent infringement
The availability of equitable remedies, such as injunctions, also varies by jurisdiction. Even if an injunction is issued, the defendant can often change a few lines of code or rearchitect a database (sometimes easier said than done) to get around it. This can result in an endless game of “whack-a-mole” with the defendant.
7.2 Digital objects: trade secrets vs. copyright
Copyright should be given careful consideration depending on the type of digital objects being protected. For example, copyright protection may be the best solution for audio and video recordings, especially if such recordings are an original work of authorship and/or will be widely distributed.
7.3 Digital objects: trade secrets vs. contract rights
Trade secrets and contract rights should be considered together. In fact, if a third party generates the trade secret data on behalf of the legal owner, there must be some contract in place to establish “reasonable means of protection.” When claiming trade secret protection, the trade secret holder most often points to some type of contractual arrangement (such as non-use and/or non-disclosure agreement) with the alleged misappropriator. If the holder fails in asserting trade secret misappropriation, they can most often rely on basic breach of contract as the remedy.
The holder, however, must correctly and separately plead both causes of action. Otherwise, the tribunal may find that if the data is not subject to trade secret protection, it is also not subject to confidentiality or non-use restrictions. This is why it is paramount that trade secret data is managed separately from merely confidential or proprietary data.
7.4 Mixed protection strategies for digital data
Based on the above, a combination of patent, trade secret and contractual rights is likely the best strategy for technological innovation. Patent protection may be best suited for a unique IoT device, communication protocol or data storage used to collect and transmit the digital data. Trade secret protection is best suited for the algorithm and data itself (raw and processed). Contractual rights are needed if a third party is involved in the collection, processing or sharing of the data. Such a triple approach gives the data owner a wide array of enforcement options.