On-device or on-premises artificial intelligence: what does it offer?
A significant direction for artificial intelligence, especially advanced AI like generative AI, is to have the AI processing performed on the same device, or within the same premises, as the users who will benefit from it.
This can be considered part of edge computing, because it can involve pre-processing data before it is sent to a cloud-driven AI platform, or post-processing the results that come back from one.
What is desirable about this is improved energy efficiency for cloud-based AI, reduced data-transfer requirements, and assurance of user privacy, corporate confidentiality and data sovereignty, because only a minimal amount of data is processed in an online environment. Apple even takes this further with its Private Cloud Compute arrangement, which extends device-level privacy protections to more intense processing that can’t be performed on the device itself.
On-device and on-premises AI relies primarily on smaller language models, compared to the large language models that cloud-based AI services like ChatGPT rely on. These models are focused on the data that exists on, or is likely to come in to, the machine or the logical network. This can allow for improved data management, or permit a custom language model that reflects personal or corporate needs.
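To give a feel for how approachable this has become, here is a minimal sketch of running a compact language model entirely on the local machine, assuming the Hugging Face transformers library is installed. The model name is only an example of a small instruction-tuned model, not a recommendation.

```python
# A minimal sketch, assuming the Hugging Face transformers library is
# installed. The model name is just an example of a compact model that
# could fit on a device; swap it for whatever the hardware can hold.
from transformers import pipeline

generator = pipeline(
    "text-generation",
    model="Qwen/Qwen2.5-0.5B-Instruct",  # example small model, not a recommendation
)

# Inference happens locally; once the weights have been downloaded,
# no prompt text leaves the machine.
result = generator(
    "Summarise: the Thursday meeting has moved to 10am in Room 2.",
    max_new_tokens=60,
)
print(result[0]["generated_text"])
```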
Why on-device and on-premises AI?
A key driver is data security and end-user privacy: the data never leaves the device or the premises for artificial-intelligence or machine-learning processing. This satisfies business and industry compliance expectations such as privacy, corporate confidentiality and data sovereignty requirements.
Another benefit is improved performance and personalisation in artificial-intelligence and machine-learning processing. Because the processing takes place on a local machine, it avoids oversubscribed cloud-computing services that can underperform under load. The language model can also be tuned to the individual user or organisation, keeping it lean and highly personalised.
Energy consumption is reduced compared to sending the data out to cloud-computing data centres. Telecommunications links and cloud-computing services are also used more efficiently, because smaller amounts of data are sent over them. Another key benefit is improved service resilience: because you aren’t heavily dependent on online resources, the functionality keeps working if the network link fails.
Hybrid (cloud plus on-device or on-premises) AI setups can allow for more sophisticated artificial-intelligence and machine-learning processing, and for working with multiple custom environments. This is because much of the data handling is done locally, before requests are submitted to the cloud or after results are received.
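As a rough sketch of what that local data handling could look like, the following example strips obvious personal identifiers on the local machine before anything is submitted to a cloud AI service. The endpoint URL is purely hypothetical and the patterns are deliberately simple.

```python
# A minimal sketch of hybrid pre-processing: a local pass redacts
# obvious personal identifiers before the text is sent to a cloud
# AI service. The endpoint URL below is purely hypothetical.
import re
import requests

EMAIL = re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+")
PHONE = re.compile(r"\+?\d[\d\s-]{7,}\d")

def redact(text: str) -> str:
    """Replace e-mail addresses and phone numbers with placeholders, locally."""
    text = EMAIL.sub("[email]", text)
    return PHONE.sub("[phone]", text)

document = "Contact Jane on +61 3 9123 4567 or jane@example.com about the audit."

# Only the redacted text ever crosses the network link.
response = requests.post(
    "https://cloud-ai.example.com/v1/summarise",  # hypothetical endpoint
    json={"text": redact(document)},
    timeout=30,
)
print(response.json())
```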
On-device AI processing
On-device AI is about having the AI workload handled by the same device that will make use of the results. This is facilitated through either a dedicated third processor, known as a neural processing unit (NPU), or a very powerful general-purpose processor that sets aside some of its cores for neural processing to handle AI tasks.
A good analogy is the NAS units that have a graphics processor in addition to their primary CPU. These devices use the graphics processor to accelerate datatype translation, such as converting multimedia files into other formats, and similar processing tasks.
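To illustrate how software taps this kind of hardware, here is a minimal sketch assuming the onnxruntime Python package and a hypothetical model.onnx file. It asks for NPU- or GPU-backed execution providers first and falls back to the plain CPU if none are available.

```python
# A minimal sketch, assuming the onnxruntime package and a hypothetical
# model.onnx file. It prefers NPU- or GPU-backed execution providers
# and falls back to the plain CPU.
import onnxruntime as ort

# Providers in order of preference; only those compiled into this build
# of ONNX Runtime will appear in get_available_providers().
preferred = [
    "QNNExecutionProvider",     # Qualcomm NPUs (e.g. Snapdragon X)
    "DmlExecutionProvider",     # DirectML on Windows GPUs/NPUs
    "CoreMLExecutionProvider",  # Apple Neural Engine via Core ML
    "CPUExecutionProvider",     # universal fallback
]
available = ort.get_available_providers()
providers = [p for p in preferred if p in available]

# "model.onnx" is a placeholder for whatever local model the device runs.
session = ort.InferenceSession("model.onnx", providers=providers)
print("Running on:", session.get_providers())
```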
There will also be an expectation of generous RAM and storage capacity on these devices. This is being answered easily thanks to Moore’s Law, with the cost of additional storage and RAM falling significantly.
Such setups can be implemented on regular computers or on mobile computing devices like smartphones and tablets.
On-premises AI processing
The on-premises AI approach relies on a server or NAS on the same logical network as the end users to process the AI data. This comes into its own with on-premises or hybrid cloud-computing setups, where the desire is to keep the important data on the user’s premises.
This could be a server or NAS that uses artificial intelligence and machine learning to make sense of a data set stored on it, or a server or NAS that performs AI tasks for client computers that don’t have on-device AI abilities. It can even lead to local chatbots that supply answers based on locally held organisational data, as sketched below.
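A minimal sketch of that chatbot idea follows. The documents and question are invented, the retrieval is a naive keyword-overlap score, and a real deployment would pair proper embedding-based retrieval with an on-premises language model.

```python
# A minimal sketch of a local chatbot grounding its answers in documents
# held on the premises. Retrieval here is a naive keyword-overlap score;
# a real deployment would use embeddings and an on-premises model.
documents = {
    "leave-policy.txt": "Staff accrue four weeks of annual leave per year.",
    "wifi-guide.txt": "Guest Wi-Fi uses the network name OFFICE-GUEST.",
}

def retrieve(question: str, top_n: int = 1) -> list[str]:
    """Rank locally stored documents by how many question words they share."""
    words = set(question.lower().split())
    scored = sorted(
        documents.items(),
        key=lambda kv: len(words & set(kv[1].lower().split())),
        reverse=True,
    )
    return [text for _, text in scored[:top_n]]

question = "How many weeks of annual leave do staff get?"
context = "\n".join(retrieve(question))

# The context and question would be fed to an on-premises language model;
# nothing is sent to an outside service.
prompt = f"Answer using only this context:\n{context}\n\nQuestion: {question}"
print(prompt)
```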
The trends associated with on-device and on-premises AI
2024 has effectively become the year of general-purpose on-device AI processing, arriving both on the mobile-platform devices that run mobile operating systems and on the regular computers that run desktop operating systems.
Some of the premium Android smartphones, tablets and smartwatches introduced in 2024 by the likes of Samsung and Google are being equipped with AI functionality. These implement ARM64 mobile silicon, such as Qualcomm’s Snapdragon processors, and use this technology for voice-to-text, advanced search, machine translation, photo editing and similar functionality. Apple is introducing this kind of on-device AI to its latest iPhones and iPads, powered by its latest silicon, as part of Apple Intelligence, its branding for this functionality. This offers AI-driven inbox management, document and recording summarisation, image editing, AI-driven emojis and similar functions.
As well, during this year, Microsoft built on-device AI functionality into Windows 11 that comes alive on computers with neural processing units. This is marketed as Copilot+ and is being offered on laptop computers that use Qualcomm Snapdragon X (ARM64) silicon or, shortly, Intel Lunar Lake Core Ultra and AMD Strix Point (x86-64) silicon. These offer video transcription and captioning, image creation and editing, video editing, and document summarisation, amongst other things. This has been underscored by a deluge of Copilot+ AI-capable laptops being launched or given their first outing at the Internationale Funkausstellung 2024 in Berlin, with some of the units equipped with Intel silicon and others with Qualcomm Snapdragon X silicon.
Apple is also offering a similar kind of artificial intelligence for the latest Macintosh computers with the latest Apple M-series silicon. This will offer the same kind of features as the iOS and iPadOS implementations, but with a richer interface. Across all the Apple operating systems, there is support for optional hybrid operation with ChatGPT, alongside Apple’s Private Cloud Compute arrangement to protect users’ data during off-device processing.
QNAP and Synology are working on equipping newer NAS units, and newer versions of their NAS operating systems, for artificial intelligence, with AI being seen as part of a NAS’s feature set. This will primarily be about managing or indexing the data held on these devices themselves, but someone has even prototyped a NAS-based local ChatGPT-style setup as a proof of concept for secure on-premises generative AI. There is also the idea of using business-grade or enthusiast-grade NAS units as part of edge-computing setups, to permit pre-processing of data before it is submitted to cloud-based AI.
Conclusion
On-device and on-premises artificial intelligence, including hybrid setups such as edge-based AI or private-cloud AI, is expected to be a key turning point for this technology. This will most likely be driven by calls for secure, private and bespoke data handling, and by the need to keep generative AI technology relevant to most users.