Tag: content creation

Questions are being raised about generative artificial intelligence

What is AI

Artificial intelligence is about using machine learning and algorithms to analyse data in order to make decisions based on that data. More specifically, it is about recognising and identifying patterns in the data presented to the algorithm, based on what the algorithm has been taught.

This is primarily used with speech-to-text, machine translation, content recommendation engines and similar use cases. As well, it is being used to recognise objects in a range of fields like medicine, photography, content management, defence, and security.

You may find that your phone’s camera uses this as a means of improving photo quality, or that Google Photos uses it for facial recognition as part of indexing your photos. Or Netflix and other online video services use it to build up a “recommended viewing” list based on what you previously watched. As well, the likes of Amazon Alexa, Apple Siri or Google Assistant use this technology to understand what you say and hold a conversation.

What is generative AI

Generative artificial intelligence applies artificial intelligence, including machine learning, towards creating content. Here, it is about using machine learning, typically drawing on different data collections, along with one or more algorithms to create this content. It is best described as programmatically synthesising material from other source material.

This is exemplified by ChatGPT and similar chatbots that respond to conversational prompts by creating textual, audio or visual material. This is seen as a killer app for generative AI. But a “voice typeface” or “voice font” that represents a particular person’s voice for text-to-speech applications could be a similar application.

Sometimes generative AI is used as a means to present statistical information in an easy-to-understand form. For example, it could generate an image collection of particular cities that is shaped by geographically-relevant data.

The issues that are being raised

Plagiarism

Here, one could use a chatbot to create what looks like new original work using material from other sources, without attributing the content creators whose material existed in those sources.

Nor does it require the end-user to make a critical judgement call about the sources or the content created, or allow the user to apply their own personality to the content.

This affects academia, journalism, research, the creative industries and other use cases. For example, education institutions see this as something that impacts how students are assessed, such as whether the classic written approach is to be maintained as the preferred method or interleaved with interview-style oral assessment methods.

Provenance and attribution

It can also extend to identifying whether a piece of work was created by a human or by generative artificial intelligence, and to identifying and attributing the original content used in the work. It also encompasses the privacy of individuals who appear in work like photos or videos, and whether personal material from one’s own image collection is being properly used.

This would be about, for example, having us “watermark” content we create in or export to the digital domain and having to identify how much AI was used in the process of creating the content.
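One simple way to make that kind of provenance and AI-use declaration machine-verifiable is to pair the content with a metadata record that includes a cryptographic hash of the content. The sketch below is a minimal, hypothetical illustration in Python; it is not modelled on any particular standard, and the field names are assumptions made for illustration only.

```python
import hashlib
import json


def make_provenance_record(content: bytes, creator: str, ai_share: float) -> str:
    """Build a simple provenance record for a piece of content.

    The record pairs a SHA-256 hash of the content with declarations
    about who created it and roughly how much of the work was done by
    generative AI (0.0 = fully human, 1.0 = fully AI-generated).
    """
    record = {
        "sha256": hashlib.sha256(content).hexdigest(),
        "creator": creator,
        "ai_contribution": ai_share,
    }
    return json.dumps(record, sort_keys=True)


def verify_provenance(content: bytes, record_json: str) -> bool:
    """Check that the content still matches the hash in its record."""
    record = json.loads(record_json)
    return hashlib.sha256(content).hexdigest() == record["sha256"]
```

A real scheme would also digitally sign the record so the declarations themselves can’t be forged, but the hash alone is enough to detect content that was altered after the declaration was made.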

Creation of convincing disinformation content

We are becoming more aware of disinformation and its effect on social, political and economic stability. It is something we have become sensitised to since 2016 with the Brexit referendum and Donald Trump’s election victory in the USA.

Here, generative artificial intelligence could be used to create “deepfake” image, audio and video content. An example of this was a recent fake image of an explosion at the Pentagon that was sent around the Social Web and rattled Wall Street.

These algorithms could be used to create the vocal equivalent of a typeface based on audio recordings of a particular speaker. This vocal “typeface” could then be used with text-to-speech to make it appear as though the speaker said something in particular. It could be used to make it seem as though a politician had contradicted themselves on a sensitive issue or given authority for something critical to occur.

Or a combination of images or videos is used to create another image or video that depicts an event that never happened. This can involve the use of stock imagery or B-roll video mixed in with other material.

Displacement of jobs in knowledge and creative industries

Another key issue regarding generative artificial intelligence is what kind of jobs this technology will impact.

There is a strong risk that a significant number of jobs in the knowledge and creative industries could be lost thanks to generative AI. This is because the algorithms could be used to turn out material, rather than having people create the necessary work.

But there will be a want in some creative fields to preserve the human touch when it comes to creating a work, especially as such work is often harvested as “training material” for artificial-intelligence and machine-learning algorithms.

It may also be found that some processes involved in the creation of a work could be expedited using this technology while still allowing room for the human touch. Often this comes about during editing processes like cleaning up and balancing audio tracks, or adjusting colour, brightness or contrast in image and video material, with such processes working as an “assistant”. It can also be about accurately translating content between languages, whether as part of content discovery or as part of localisation.

There could be the ability for people in the knowledge and creative industries to differentiate between so-called “cookie-cutter” output and artistic output created by humans. This would also include the ability to identify the artistic calibre that went into that work.

The want to slow down and regulate AI

There is a want, even within established “Big Tech” circles, to slow down and regulate artificial intelligence, especially generative AI.

This encompasses slowing down the pace of AI technology development, especially generative-AI development, to allow the possible impact that AI could have on society to be critically assessed and, perhaps, for “guardrails” to be installed around its implementation.

It also encompasses an “arms race” between generative-AI algorithms and algorithms that detect or identify the use of generative AI in the creation of work. It will also include how to identify source material, or the role generative AI had in the work’s creation. This is because generative AI may have a particular beneficial role in the creation of a piece of work such as to expedite routine tasks.

There is also the emphasis on what kind of source material the generative-AI algorithms are being fed to generate particular content. It reminds us of the GIGO (garbage in, garbage out) concept associated with computer programming: you can’t make a silk purse out of a sow’s ear.

What can be done

There has to be more effort towards improving social trustworthiness of generative AI when it comes to content creation. It could be about where generative AI is appropriate to use in the creative workflow and where it is not. This includes making it feasible for us to know whether the content is created by artificial intelligence and the attribution of any source content being used.

Similarly, there could be a strong code of ethics for handling AI-generated content especially where it is used in journalism or academia. This is more so where a significant part of the workload involved in creating the work is contributed by generative AI rather than it being used as part of the editing or finishing process.

USB microphones or traditional mics for content creation?

Blue Yeti Nano USB microphone product image courtesy of Logitech

Blue Yeti Nano – an example of a USB microphone pitched at podcasters

Increasingly as we create and post content online, we are realising that microphones are becoming a valuable computer accessory for recording or broadcasting our voices or other live sound. This is more so where we are making podcasts or videos or even streaming video games with our own commentary, with this kind of content creation becoming a viable cottage industry in its own right.

Even videoconferencing with Zoom and similar software has had us wanting to use better microphones so we can be heard clearly during these video calls. This was important while stringent public-health measures were in place to limit the spread of the COVID coronavirus plague, but is now coming into play with the hybrid (online and face-to-face) work and education settings that we are taking advantage of.

What we are realising is that the integrated condenser microphone in your laptop computer or Webcam isn’t really up to scratch for this kind of content creation. This is similar to the days of the cassette recorder, when people who aspired to make better live recordings stopped using their tape recorder’s built-in microphone and used a better-quality external microphone.

But there are two ways of connecting an external microphone to your computer – USB port or a traditional microphone input.

USB microphone

Lenovo Yoga 2 Pro convertible notebook Right-hand side - Power switch, Volume buttons, 3.5mm audio jack, USB 2.0 port

The USB port on most regular computers is what you would plug a USB microphone into for plug-and-play recording

The USB microphone has at least one microphone element directly connected to an integrated audio interface. This converts the sound picked up by the microphone into a digital form useable by the host computer.

Some of these microphones have an audio-output function which feeds a headphone jack so you can monitor what you are recording or broadcasting with a set of headphones. You may even find that some USB microphones have a microphone-level analogue audio output so you can connect them to a traditional audio device rather than just a computer.

All of these USB microphones present to the host computing device as standard USB Audio input devices, with those that have headphone outputs also presenting the headphone jack as a standard USB Audio output device. This means the USB Audio class drivers supplied with your computer’s operating system are used to enable these microphones, without extra software needing to be installed on the computer.

An increasing number of manufacturers supply audio-processing software that performs equalisation, level control or dynamic-range control on the host computer. Or the digital-audio recording software that you use on your computer may be able to perform this function for you. All of this audio processing happens in the digital domain using your computer’s CPU or GPU.
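To illustrate what this digital-domain processing involves, here is a minimal sketch of one such process, dynamic-range compression, in plain Python. It is a simplified model assuming floating-point samples in the range -1.0 to 1.0, not how any particular manufacturer’s software actually works.

```python
def compress(samples, threshold=0.5, ratio=4.0):
    """Apply simple peak-based dynamic-range compression.

    Samples are floats in the range -1.0..1.0.  Any part of the
    signal above the threshold is scaled down by the given ratio,
    which is roughly what a compressor's gain stage does to tame
    loud peaks while leaving quieter passages untouched.
    """
    out = []
    for s in samples:
        level = abs(s)
        if level > threshold:
            # Reduce only the portion that exceeds the threshold.
            level = threshold + (level - threshold) / ratio
        out.append(level if s >= 0 else -level)
    return out
```

Real-world processors add attack and release times so the gain changes smoothly rather than sample by sample, but the core idea of reducing level above a threshold is the same.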

The integrated audio interface allows designers of these USB microphones to set up a sophisticated array of multiple microphone elements. This allows them to work as one-point stereo microphones or to use microphone-array techniques to determine their sensitivity or pickup pattern. You may find that you control how these sophisticated microphones operate through manufacturer-supplied software or perhaps a hardware switch on the microphone.

Traditional microphone

Behringer UltraVoice XM8500 microphone product image courtesy of Behringer

The Behringer UltraVoice XM8500 microphone – an example of a traditional microphone

The common traditional microphone makes the sound that it picks up available as a low-level analogue signal. It is designed to be connected to an amplifier, recording device, mixing desk or other audio device that has an integrated microphone amplifier circuit.

This would be either a balanced or unbalanced signal depending on whether the microphone is for professional or consumer use. However, most value-priced professional-grade mono dynamic microphones, typically pitched at PA and basic recording use, can work as balanced or unbalanced mics. That is because the mic’s cable connects to the mic itself via an XLR plug, even though the cable would plug into the equipment using a 6.35mm mono phone plug.

There are electret-condenser microphones that work in a different way to the common dynamic microphone, but these depend on a power source. This is typically provided by a battery installed in the microphone, or by the associated equipment offering “phantom power” or “plug-in power” to these microphones via their cable.

If you use a traditional microphone with your computer, you would need to use an audio interface of some sort. The traditional sound card installed in a desktop computer or some basic USB audio interfaces that you use with your laptop computer would offer a 3.5mm phone-jack microphone input which would be mono (2-conductor) at least or may be stereo (3-conductor) so you can use a one-point stereo mic. These could work well with a wide range of microphones that have this connection type, typically those pitched at portable-recorder or home-video use.

Then the better USB audio interfaces would offer at least one microphone input in the form of either a 6.35mm phone jack or a three-pin XLR socket, most likely using a balanced wiring approach. You can still use a mic that has a 3.5mm phone plug if you use an adaptor that you can buy from an electronics store.

Shure X2U USB audio interface product image courtesy of Shure

Shure X2U USB audio interface that plugs into the XLR connector on a common traditional microphone

Let’s not forget that a significant number of microphone manufacturers offer USB audio interfaces that plug into their microphones’ XLR connectors. These adaptors, such as the Shure X2U, are powered by the host computer’s USB interface and, in a lot of cases, provide the “phantom power” needed by electret-condenser microphones.

It is also worth noting that the better quality USB audio interfaces will do a better job at the sound-handling process and will yield a high-quality signal. This is compared to the audio interface in your laptop computer or Webcam, or baseline soundcards and USB audio modules which may not make the mark for sound quality.

For a long time there have been traditional one-point stereo microphones but most of them have been pitched at hobbyist or consumer use with stereo tape recorders. Most such microphones use a hardwired cable with a 3.5mm stereo phone plug or a 5-pin standard DIN plug if the recorder has a stereo microphone socket, or two 6.35mm or 3.5mm mono phone plugs if it has a pair of mono microphone sockets. But some professional stereo microphones have a 5-pin XLR or Neutrik connection and come with a breakout cable that has two XLR plugs to connect to a pair of microphone inputs.

What microphone type suits your application better

A USB microphone is valuable for laptops or small desktop computers, and is intended only for situations where you are using software on your computing device to record or broadcast.

You may end up getting more “bang for your buck” out of a USB microphone purchase due to the integrated audio-interface design that they have. This may be of value to people starting out in podcasting or similar audio-recording and broadcasting tasks who want a low-risk approach. As well, you may find them easy to set up and use with your computer, especially where the microphone relies on class drivers supplied by the operating system rather than proprietary driver software.

USB microphones are considered to be more portable because you don’t need to carry a USB audio interface with you when you intend to record “on the road” with your computer.

Another advantage is that you have a very short low-level unbalanced analogue audio link between the microphone elements and the signal-processing electronics. This means you avoid the risk of AC hum or other undesirable noise getting into your recording through a long unbalanced low-level audio link.

You may find it difficult to use a USB microphone with a digital camera or camcorder. This is because not many of them provide USB Audio device support for microphones and similar devices, and they may not even have a host-level USB connection for any peripherals. Similarly, you may find it difficult to use them with most mobile-platform devices because of the way some versions of iOS or Android handle them.

A traditional microphone with a common connection type excels when it comes to versatility. This is more important where you intend to use it with a wide range of audio devices like recording equipment or mixing consoles. Similarly, the traditional-microphone market excels when it comes to microphones that have particular sensitivity and audio characteristics out of the box.

A traditional microphone also comes into its own when you want to record with a tape recorder or other standalone recording device to assure recording reliability. This use case includes using external microphones with your video equipment to obtain better sound on your video recordings.

Some users may find that connecting traditional mics to their computer via a mixing console of some sort may give them better hands-on control over how their recordings or broadcasts will sound. Here, you may find that some of the newer mixing consoles are likely to have their own USB audio interface to connect to a computer especially if they are more sophisticated. As well, some users who have used mixing desks or standalone recording devices frequently will find themselves at ease with this kind of setup. This is because these devices offer the ability to adjust the sound “on the fly” or mix multiple microphones and audio sources for a polished recording or broadcast.

Conclusion

A cardinal rule to remember is that you will end up having to spend a good amount of money on a good-quality microphone if you want to make good-quality recordings or online broadcasts. No digital processing can make a silk purse out of a sow’s ear when it comes to audio recording.

Here, the USB microphone will come into its own if you are just using a computer. On the other hand, a good-quality traditional microphone used with a USB audio interface could answer your needs better if you want pure flexibility.

Dell jumps on the prosumer bandwagon with the XPS Creator Edition computers

Articles

Dell XPS 17 laptop press picture courtesy of Dell Australia

Dell is offering variants of the latest XPS 17 desktop-replacement laptop that will be pitched at prosumers and content creators

What is Dell’s XPS 17 ‘Creator Edition?’ | Windows Central

Dell Reveals Redesigned XPS 15 and Powerful New XPS 17 Aimed at Creators | Petapixel

Dell’s new XPS Desktop looks to be a premium powerhouse PC | PC World Australia

From the horse’s mouth

Dell

XPS 17 Series (USA product page with Creator Edition packages)

XPS Desktop series (USA product page with Creator Edition packages)

NVIDIA

RTX Studio program (Product Page)

My Comments

As I have previously reported, computer-equipment manufacturers are waking up to the realisation that prosumers and content creators are a market segment to address. This group of users was heavily courted by Apple with the MacOS platform but Windows-based computer vendors are answering this need as a significant amount of advanced content-creation and content-presentation software is being written for or ported to Windows 10.

Here, the vendors are tailoring the specifications of some of their performance-focused computers towards the kind of independent content creator or content presenter who sources their own work and manages their own IT. This can range from hobbyists, through those of us who create online content to supplement other activities, to small-time professionals who get work “by the job”. It can also appeal to small organisations that create or present content but don’t necessarily have their own IT departments, or at least not the kind of IT department that big corporations have.

Lenovo answered this market with a range of prosumer computers in the form of the Creator Series, which encompassed two laptops and a traditional tower-style desktop. Now Dell is stepping up to the plate with their Creator Edition computer packages. Here, the approach is to have computers that are specified for content creation or content presentation, but aren’t workstation-class machines, identified with a distinct “Creator Edition” logo.

The first of these are the Creator Edition variants of the latest Dell XPS 17 desktop-replacement laptop. These have, for their horsepower, an Intel Core i7-10875H CPU and a discrete GPU in the form of the NVIDIA GeForce RTX 2060 with 6GB of display memory, based on the NVIDIA Max-Q mobile graphics approach. This will run RTX Studio graphics drivers that are tuned for content-professional use and will be part of the RTX Studio program that NVIDIA runs for content professionals.

The display used in these packages is a 17” 4K UHD touch display that is rated for 100% Adobe RGB colour coverage. The storage capacity on these computers is 1 Terabyte in the form of a solid-state drive. The only difference between the two packages is that the cheaper variant runs with 16GB of system RAM and the premium variant with 32GB.

Dell is also offering a Creator Edition variant of its XPS-branded desktop computer products. This will be in the form of a traditional tower-style desktop computer equipped with the latest Intel Core i9 CPU and NVIDIA GeForce RTX 2070 Super graphics card, and able to be specced with up to 64GB of RAM and up to 2TB of storage. It has all the expandability of a traditional form-factor desktop computer, something that would come in handy for project studios where special audio and video interface cards come into play.

What is being shown here is that computer manufacturers are recognising the content-creator and prosumer market segment that wants affordable but decent hardware that can do the job. It will be interesting to see which other large computer manufacturers will step up to the plate with a product range courting content creators and prosumers.

Microsoft researches a way to consolidate recordings from multiple recording devices

Article – From the horse’s mouth

Microsoft Research

Abstract

Detailed article – PDF

My Comments


Microsoft is working on a way to create better recordings from many smartphones and audio recorders recording the same event

Microsoft has completed some research on how to amalgamate audio recordings of a meeting captured by different recording devices, to turn out a higher-grade recording that captures the whole of the meeting. It is seen as the audio equivalent of experiments and projects that aggregate multiple camera views of the same object, or could be seen as a way to create a “Claytons microphone array” using multiple recording devices with their own microphones.

The technique involves creating audio fingerprints of each of the recordings, in a similar vein to what Shazam and its allies do to “name that song”. These fingerprints are used to match the timing of each of the recordings and identify what was commonly recorded, allowing for the fact that one person could start or stop a recording device earlier or later than another.


This can lead to TV-grade multi-camera video recordings from a combination of DSLRs, high-end cameras like this one…

The technology assumed to be used in this context is standalone file-based digital notetaker recorders, or the audio-recording function incorporated in many a smartphone or tablet, typically by virtue of an app. Typically these recorders are recording the same event with integrated microphones, implementing automatic gain control and, in some cases, picking up their “own” background noise.

But you could extend this concept to integrating audio recordings made on legacy media like audio tape using standalone devices, or the soundtracks of video recordings made during the same event that are subsequently “dubbed” to audio files. A good example could be someone who uses a “shoebox” or handheld cassette recorder to make a reliable recording of the meeting using something they are familiar with, or someone videoing the meeting using that trusty old camcorder.

Sony FRD-AX33 4K HandyCam camcorder press picture courtesy of Sony America

… and camcorders like this one of special events.

There are plans to conduct further research into this topic to cater for recording music, such as when the same concert performance or religious service is recorded by two or more people with equipment of different capabilities.

A good question to raise from the research is how to “time-align” or synchronise a combination of audio and video recordings of the same event that were recorded at the same time with equipment that has different recording capabilities. This is without the need to record synchronisation data on each recording device during production, and allowing for the use of equipment commonly used by consumers, hobbyists / prosumers and small organisations.

The reality that can surface is that someone records the event using top-shelf gear yielding excellent audio, while others film from different angles using camcorders, digital cameras and smartphones that record not-so-good sound thanks to automatic gain control and average integrated mics. Meanwhile, the good digital cameras and camcorders still use their excellent optics and sensors to capture good-quality vision.

Once this is worked out, it could then allow a small-time video producer or a business’s or church’s in-house video team to move towards “big-time” quality by using top-shelf audio gear to capture sound and one or two camcorders operated by different operators to create “TV-studio-grade” multi-camera video.

Who knows whether the idea of post-production audio-level synchronising and “blending” will take off for both conference recordings and small-time video production.