T 1591/13 of 7.3.2019

European Case Law Identifier: ECLI:EP:BA:2019:T159113.20190307
Date of decision: 07 March 2019
Case number: T 1591/13
Application number: 05793444.0
IPC class: H04N 17/00
H04N 17/04
Language of proceedings: EN
Distribution: D
Versions: Unpublished
Title of application: VIDEO QUALITY OBJECTIVE EVALUATION DEVICE, EVALUATION METHOD, AND PROGRAM
Applicant name: NIPPON TELEGRAPH AND TELEPHONE CORPORATION
Opponent name: -
Board: 3.5.04
Headnote: -
Relevant legal provisions:
Rules of procedure of the Boards of Appeal Art 13
European Patent Convention 1973 Art 84
Keywords: Late-filed main request - admitted (yes)
Claims - clarity
Claims - main request and auxiliary requests (no)
Catchwords:

-

Cited decisions:
T 2480/11
Citing decisions:
-

Summary of Facts and Submissions

I. The appeal is against the decision of the examining division dated 25 February 2013 refusing European patent application No. 05793444.0, which was published as international application WO 2006/043500 A1.

II. The application was refused on the grounds that the invention was not disclosed in a manner sufficiently clear and complete for it to be carried out by a person skilled in the art (Article 83 EPC), and that the subject-matter of claim 1 of the main request and the first to third auxiliary requests extended beyond the content of the application as filed (Article 123(2) EPC).

III. The applicant filed notice of appeal. With the statement of grounds of appeal, the appellant filed amended claims 1 to 8 and requested that the decision under appeal be set aside and that a patent be granted on the basis of the claims filed with the statement of grounds of appeal (main request) or, alternatively, on the basis of the claims of the first, second or third auxiliary request which had formed the basis for the decision under appeal. The appellant provided arguments as to why the application met the requirements of Article 83 EPC and the subject-matter of the claims of all requests met the requirements of Article 123(2) EPC.

IV. The board issued a summons to oral proceedings. In a communication under Article 15(1) RPBA (Rules of Procedure of the Boards of Appeal, OJ EPO 2007, 536), annexed to the summons, the board gave its provisional opinion that the requirements of Articles 83 and 84 EPC 1973 were not met, and that the subject-matter of claim 1 of each of the requests extended beyond the content of the application as filed (Article 123(2) EPC).

V. With its reply dated 6 February 2019, the appellant filed amended claims of a main request and an auxiliary request and amended description pages 17, 18 and 19. It indicated a basis for the amendments and submitted arguments as to why the amended claims met the requirements of Articles 83 and 84 EPC 1973. The appellant also submitted the document "Recommendation ITU-R BT.500-13 (01/2012)", which disclosed methods for the subjective assessment of the quality of television pictures.

VI. On 7 March 2019, the board held oral proceedings.

The appellant was represented. It filed amended claims of a new main request, which replaced the claims of the previous main request.

The appellant's final requests were that the decision under appeal be set aside and that a European patent be granted on the basis of the claims according to the main request filed at the oral proceedings of 7 March 2019 or, in the alternative, on the basis of the claims according to the first auxiliary request filed by letter dated 6 February 2019, or according to one of the second to fourth auxiliary requests corresponding to the first to third auxiliary requests underlying the decision under appeal.

At the end of the oral proceedings, the chairman announced the decision.

VII. Claim 1 of the main request reads as follows:

"A video quality objective assessment device which estimates a subjective quality of a video, characterized by comprising:

a temporal/spatial feature amount derivation unit (12) which derives a temporal/spatial feature amount (PC) as a feature amount of deterioration which has occurred in a deteriorated video signal, using the deteriorated video signal (PI) and a reference video signal (RI) as a signal before deterioration of the deteriorated video signal; and

a subjective quality estimation unit (14) which estimates a subjective quality, Y, concerning the deteriorated video signal (PI) by weighting the temporal/spatial feature amount, PC, using first coefficients, a, gamma, preset by user's subjective assessment characteristics of a video, wherein the first coefficients a, gamma are obtained by determining a combination of optimal values so as to properly match the subjective assessment by the user with the objective assemement [sic] value Y by checking the subjective assessment characteristic of the user with respect to the video (PI) in which local video deterioration has occurred while chaning [sic] the deterioration amount;

said temporal/spatial feature amount derivation unit (12) comprising

first derivation means (121) for deriving the spatial feature amount, DS, of deterioration which has occurred in an assessment target frame of the deteriorated video signal (PI),

second derivation means (122) for deriving a temporal feature amount (C) of the deterioration which has occurred in the assessment target frame of the deteriorated video signal (PI), and

third derivation means (123) for deriving the f temporal/spatial feature amount, PC, using the spatial feature amount, DS, and the temporal feature amount, C),

wherein said first derivation means (121) calculates a deterioration amount (Si) of each block obtained by dividing the assessment target frame from the deteriorated video signal and the reference video signal, calculates, as a statistics, Xave_all, Xave_bad, of a spatial deterioration amount (S) distribution in the assessment target frame, a frame average deterioration amount, Xave_all, as a value obtained by averaging deterioration amounts over the entirety of the assessment target frame and a local deteriorated region average deterioration amount, Xave_bad, as a value obtained by averaging deterioration amounts belonging to a region of the assessment target frame in which deteriorations falling in a predetermined deterioration intensity range have occurred, and determines said spatial feature amount, DS, from second coefficients, A, B, preset by the user's subjective, assessment characteristic of the video and said statistics, Xave_all; Xave_bad, of the spatial deterioration amount distribution, where A is a coefficient obtained in advance by a subjective assessment characteristic when no local video deterioration has occured in space, and B is is [sic] a coefficient obtained in advance by a subjective assessment characteristic when local video deterioration has occurred in space, and;

said third derivation means (123) performs a function of using said spatial feature amount, DS, as the deterioration amount, C, to derive said temporal/spatial feature amount, PC, every measurement time (ut) based on the deterioration amount, C, in the presence of a localized video deterioration occurring on a time axis, an average deterioration amount, Dcons, in a steady state in the absence of the localized video deterioration occurring on the time axis, and the user's subjective assessment characteristic of the video,; and

said third derivation means (123) further determines, in said function, a localized video deterioration discrimination threshold (Fig. 12) on the basis of the average deterioration amount, Dcons, in a steady state calculated in an immediately preceding measurement time (ut) to thereby determine that the localized deterioration has occurred on the time axis when, in a current measurement time (ut), a difference between a deterioration amount, C, and the average deterioration amount, Dcons, in the steady state calculated in the immediately preceding measurement time is not smaller than the local deterioration discrimination threshold,

wherein a deterioration amount, C, at which the difference is not smaller than the local deterioration discrimination threshold is set as the deterioration amount, d, in the presence of the localized video deterioration."
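The temporal discrimination recited in the last two features of this claim can be pictured with a minimal sketch. It is an illustration only: the claim states merely that the threshold is determined "on the basis of" Dcons, so the mapping `example_threshold` below, like every function and parameter name, is a hypothetical assumption and is not taken from the application.

```python
from typing import Callable, Optional


def classify_interval(
    c_current: float,
    dcons_previous: float,
    threshold_from_dcons: Callable[[float], float],
) -> Optional[float]:
    """Return the local deterioration amount for the current measurement
    interval, or None if no localized deterioration is detected.

    c_current       -- deterioration amount C in the current interval
    dcons_previous  -- steady-state average Dcons from the preceding interval
    threshold_from_dcons -- assumed mapping from Dcons to the threshold
    """
    threshold = threshold_from_dcons(dcons_previous)
    difference = c_current - dcons_previous
    # "not smaller than the threshold" is read here as ">=".
    if difference >= threshold:
        # The amount C itself is set as the local deterioration amount.
        return c_current
    return None


def example_threshold(dcons: float) -> float:
    # Hypothetical mapping chosen for illustration only; the claim does not
    # specify how the threshold depends on Dcons.
    return 0.5 * dcons + 1.0
```

With these assumed values, an amount C of 10.0 against a preceding Dcons of 4.0 gives a difference of 6.0 and a threshold of 3.0, so the sketch reports a localized deterioration; an amount of 5.0 does not reach the threshold.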

VIII. Claim 1 of the first auxiliary request reads as follows:

"A video quality objective assessment device which estimates a subjective quality of a video, characterized by comprising:

a temporal/spatial feature amount derivation unit (12) which derives first and second temporal/spatial feature amounts (PC; X1, X2, ... , Xn) as feature amounts of deterioration which has occurred in a deteriorated video signal, using the deteriorated video signal (PI) and a reference video signal (RI) as a signal before deterioration of the deteriorated video signal; and

a subjective quality estimation unit (14) which estimates a subjective quality (Y) concerning the deteriorated video signal (RI) by weighting the first and second temporal/spatial feature amounts (PC; X1, X2, ... , Xn) using first coefficients (a, beta, gamma) preset by user's subjective assessment characteristics of a video,

said temporal/spatial feature amount derivation unit (12) comprising

first derivation means (121) for deriving a spatial feature amount (DS) of deterioration which has occurred in an assessment target frame of the deteriorated video signal (PI),

second derivation means (122) for deriving a temporal feature amount (C; frame rate; frame skip count; TI value) of the deterioration which has occurred in the assessment target frame of the deteriorated video signal, and

third derivation means (123) for deriving the first and second temporal/spatial feature amounts (PC; X1, X2, ... , Xn) using the spatial feature amount (DS) and the temporal feature amount (C; frame rate; frame skip count; TI value),

wherein said first derivation means (121) calculates a deterioration amount of each block obtained by dividing the assessment target frame from the deteriorated video signal and the reference video signal, calculates, as a statistics (Xave_all, Xave_bad) of a spatial deterioration amount (S) distribution in the assessment target frame, a frame average deterioration amount (Xave_all) as a value obtained by averaging deterioration amounts over the entirety of the assessment target frame and a local deteriorated region average deterioration amount (Xave_bad) as a value obtained by averaging deterioration amounts belonging to a region of the assessment target frame in which deteriorations falling in a predetermined deterioration intensity range have occurred, and determines said spatial feature amount (DS) from second coefficients (A, B) preset by the user's subjective assessment characteristic of the video and said statistics (Xave_all; Xave_bad) of the spatial deterioration amount distribution and;

said third derivation means (123) performs a first function of using said spatial feature amount (DS) as the deterioration amount to derive said first temporal/spatial feature amount every measurement time based on the deterioration amount in the presence of a localized video deterioration occurring on a time axis, an average deterioration amount (Dcons) in a steady state in the absence of the localized video deterioration occurring on the time axis, and the user's subjective assessment characteristic of the video, and a second function of using said temporal feature amount as the deterioration amount to derive said second temporal/spatial feature amount every measurement time based on the deterioration amount in the presence of the localized video deterioration occurring on the time axis, the average deterioration amount (Dcons) in the steady state in the absence of the localized video deterioration occurring on the time axis, and the user's subjective assessment characteristic of the video; and

said third derivation means (123) further determines, in each of said first and second functions, a localized video deterioration discrimination threshold on the basis of an average deterioration amount in a steady state calculated in an immediately preceding measurement time to thereby determine that the localized deterioration has occurred on the time axis when, in a current measurement time, a difference (d) between a deterioration amount and the average deterioration amount in the steady state calculated in the immediately preceding measurement time is not smaller than the local deterioration discrimination threshold, an average deterioration amount in the steady state obtained when the localized video deterioration on the time axis does not occur in the current measurement time, wherein a deterioration amount at which the difference (d) is not smaller than the local deterioration discrimination threshold is set as the deterioration amount in the presence of the localized video deterioration, and the average deterioration amount in the steady state is calculated by using an average value of deterioration amounts in the steady state obtained by removing time when the localized video deterioration is occurring from the current measurement time."

IX. Claim 1 of the second auxiliary request reads as follows:

"A video quality objective assessment device which estimates a subjective quality of a video, characterized by comprising:

a temporal and spatial feature amount derivation unit (12) which derives first and second temporal and spatial feature amounts (PC; X1, X2, ... , Xn) as feature amounts of deterioration which has occurred in a deteriorated video signal, using the deteriorated video signal (PI) and a reference video signal (RI) as a signal before deterioration of the deteriorated video signal; and

a subjective quality estimation unit (14) which estimates a subjective quality (Y) concerning the deteriorated video signal (RI) by weighting the first and second temporal and spatial feature amounts (PC; X1, X2, ... , Xn) using coefficients (a, beta, gamma) preset by subjective assessment characteristics of a video,

said temporal and spatial feature amount derivation unit (12) comprising

first derivation means (121) for deriving a spatial feature amount (DS) of deterioration which has occurred in an assessment target frame of the deteriorated video signal (PI),

second derivation means (122) for deriving a temporal feature amount (C; frame rate; frame skip count; TI value) of the deterioration which has occurred in the assessment target frame of the deteriorated video signal, and

third derivation means (123) for deriving the first and second temporal and spatial feature amounts (PC; X1, X2, ... , Xn) using the spatial feature amount (DS) and the temporal feature amount (C; frame rate; frame skip count; TI value),

wherein said first derivation means (121) calculates a deterioration amount of each block obtained by dividing the assessment target frame from the deteriorated video signal and the reference video signal, calculates, as a statistics (Xave_all, Xave_bad) of a spatial deterioration amount (S) distribution in the assessment target frame, a frame average deterioration amount (Xave_all) as a value obtained by averaging deterioration amounts over the entirety of the assessment target frame and a local deteriorated region average deterioration amount (Xave_bad) as a value obtained by averaging deterioration amounts belonging to a region of the assessment target frame in which deteriorations falling in a predetermined deterioration intensity range have occurred, and determines said spatial feature amount (DS) from the coefficients (A, B) preset by the user's subjective assessment characteristic of the video and said statistics (Xave_all; Xave_bad) of the spatial deterioration amount distribution and;

said third derivation means (123) performs a first function of using said spatial feature amount (DS) as the deterioration amount to derive said first temporal and spatial feature amount every measurement time based on the deterioration amount in the presence of a localized video deterioration occurring on a time axis, an average deterioration amount (Dcons) in a steady state in the absence of the localized video deterioration occurring on the time axis, and the user's subjective assessment characteristic of the video, and a second function of using said temporal feature amount as the deterioration amount to derive said second temporal and spatial feature amount every measurement time based on a deterioration amount in the presence of a localized video deterioration occurring on the time axis, an average deterioration amount in the steady state in the absence of the localized video deterioration on the time axis, and the user's subjective assessment characteristic of the video; and

said third derivation means (123) further determines, in each of said first and second functions, a localized deterioration discrimination threshold on the basis of an average deterioration amount in the steady state calculated in an immediately preceding measurement time to thereby determine that the localized deterioration has occurred on the time axis when a difference (d) between a deterioration amount in a current measurement time and the average deterioration amount in the steady state calculated in the immediately preceding measurement time is not smaller than the local deterioration discrimination threshold, an average deterioration amount in the steady state obtained when the localized video deterioration on the time axis does not occur in the current measurement time."

X. Claim 1 of the third auxiliary request differs from claim 1 of the second auxiliary request in that the last feature reads as follows:

"said third derivation means (123) further determines, in each of said first and second functions, a localized deterioration discrimination threshold on the basis of an average deterioration amount to thereby determine that the localized deterioration has occurred on the time axis when a difference (d) between a deterioration amount in a current measurement time and the average deterioration amount in the steady state calculated in the immediately preceding measurement time is not smaller than the local deterioration discrimination threshold, a degradation amount obtained when the difference is not less than the localized deterioration discrimination threshold is defined as deterioration amount obtained upon occurring the localized video deterioration on the time axis in the current measurement time, and an average deterioration amount in the steady state obtained when the localized video deterioration on the time axis does not occur in the current measurement time".

XI. Claim 1 of the fourth auxiliary request differs from claim 1 of the second auxiliary request in that the last feature reads as follows:

"said third derivation means (123) further determines, in each of said first and second functions, a localized deterioration discrimination threshold on the basis of an average deterioration amount in the steady state calculated in an immediately preceding measurement time to thereby determine that the localized deterioration has occurred on the time axis when a difference (d) between a deterioration amount in a current measurement time and the average deterioration amount in the steady state calculated in the immediately preceding measurement time is not smaller than the local deterioration discrimination threshold, a degradation amount obtained when the difference is not less than the localized deterioration discrimination threshold is defined as a deterioration amount obtained upon occurring the localized video deterioration on the time axis in the current measurement time, and an average deterioration amount in the steady state obtained when the localized video deterioration on the time axis does not occur in the current measurement time".

XII. The examining division's objections, where relevant to the present decision, may be summarised as follows.

The definitions of the steady-state average deterioration amount and the local video deterioration set out in the description on page 18, line 19, to page 19, line 10, did not allow a person skilled in the art to determine the steady-state average deterioration amount and the local video deterioration (see decision under appeal, point 11.2).

XIII. The appellant's arguments, where relevant to the present decision, may be summarised as follows.

(a) In response to the discussion during the oral proceedings, claim 1 of the main request was amended to exclude a video quality assessment based on the frame rate.

(b) The term "weighting" in claim 1 of the main request was to be understood as scaling the temporal/spatial feature amount with coefficients chosen to optimise the approximation of the subjective assessment.

(c) The steady-state average deterioration amount was more or less constant and the skilled person could generally speak of the steady-state average deterioration amount Dcons without explicitly referring to a specific measurement interval (see statement of grounds of appeal, pages 7 and 8). The average deterioration amount Dcons converged to an appropriate steady-state value as time passed (see statement of grounds of appeal, the paragraph bridging pages 9 and 10).

(d) Deterioration amounts C were calculated for each unit measurement interval (see description, page 17, lines 8 to 15), with "ut >= one frame interval" (see amended description, page 17, line 4 and letter dated 6 February 2019, page 6). The measurement interval ut was typically significantly larger than the frame interval (see letter dated 6 February 2019, page 7, first paragraph).

(e) The frame rate could be determined as the inverse of the time between two consecutive frames. Dcons was then defined as the average of these inverse values over the measurement interval.
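The calculation put forward in argument (e) can be sketched as follows. This is a minimal illustration of the appellant's submission as summarised here, not a method taken from the description; the function names and the use of frame timestamps in seconds are assumptions made for the example.

```python
def instantaneous_frame_rates(timestamps):
    """Inverse of the time between each pair of consecutive frames.

    timestamps -- frame arrival times in seconds, strictly increasing
    """
    return [1.0 / (t1 - t0) for t0, t1 in zip(timestamps, timestamps[1:])]


def average_frame_rate(timestamps):
    """Average of the inverse inter-frame times over one measurement interval."""
    rates = instantaneous_frame_rates(timestamps)
    if not rates:
        # Fewer than two frames: no inter-frame time exists, so no value
        # can be computed for the interval.
        raise ValueError("at least two frames are needed")
    return sum(rates) / len(rates)
```

For frames at 0.0, 0.5, 1.0 and 2.0 seconds, the sketch yields the values 2.0, 2.0 and 1.0, which are then averaged. It also shows that an interval containing fewer than two frames yields no value at all.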

(f) It was common in computer implemented inventions to derive and store, in advance, parameters used in subsequent processing. The derivation of these parameters need not be claimed. Although the user's subjective assessment characteristics were not defined in the claims they could be derived in advance from experiments using a subjective estimation method as described in, for example, ITU-R BT.500-13, page 12, section 4.5 (see letter dated 6 February 2019, page 3, first full paragraph).

Reasons for the Decision

1. The appeal is admissible.

2. Main request - admission into the appeal proceedings (Article 13 RPBA)

The main request was filed in response to the objections raised in respect of clarity during the oral proceedings before the board. Since the amendments were an attempt to simply exclude a video quality assessment on the basis of the frame rate (see point XIII(a) above), the board exercised its discretion under Article 13(1) RPBA and decided to admit the main request into the appeal proceedings.

3. Main request - clarity (Article 84 EPC 1973)

3.1 According to Article 84 EPC 1973, the claims "shall be clear".

Terms used in patent documents should be given their normal meaning in the relevant art, unless the description gives them a special meaning. The patent document may be its own dictionary provided that the description gives unambiguous definitions of these terms (see Case Law of the Boards of Appeal of the European Patent Office, 8th Edition, 2016, II.A.6.3.3 and T 2480/11, point 3.3.1 of the Reasons, with references to further decisions).

3.2 Claim 1 of the main request specifies "weighting the temporal/spatial feature amount, PC, using first coefficients, a, gamma, preset by user's subjective assessment characteristics of a video".

3.3 In the analysis set out below, references to the description relate to pages of the translation of the application as filed at the time of entering the regional phase, or corrected pages 17, 18 and 19 filed with the letter dated 6 February 2019.

3.4 In its normal meaning, the verb "to weight" implies that different components are multiplied by (different) factors, i.e. weights, reflecting the relative importance of the components. Hence, weighting amounts using coefficients implies that at least two amounts are multiplied by (different) coefficients.

3.5 This meaning of the verb "to weight" is also the one given in the description, see page 25, line 25, to page 26, line 19, which discloses that the objective assessment value is calculated by weighting a plurality of temporal/spatial feature amounts and adding an offset gamma.

3.6 The board is not convinced that "to weight" can be understood as defining a multiplication of one amount by a scaling factor and the addition of an offset to approximate the objective assessment (see point XIII(b) above). Such an interpretation would diverge from both the normal meaning of the verb and the meaning given in the description.
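The two readings of "to weight" discussed in points 3.4 to 3.6 can be set side by side in a short sketch. The linear form and all names are assumptions made for illustration; the decision does not reproduce the formula used in the description.

```python
def weighted_estimate(features, weights, offset):
    """Normal meaning: several amounts, each multiplied by its own
    coefficient, summed, with an offset added."""
    if len(features) != len(weights):
        raise ValueError("one weight per feature amount is required")
    return sum(w * x for w, x in zip(weights, features)) + offset


def scaled_estimate(feature, scale, offset):
    """Reading argued by the appellant: a single amount multiplied by one
    scaling factor, with an offset added."""
    return scale * feature + offset
```

In the first function the coefficients express the relative importance of at least two amounts. In the second there is only one amount, so the single coefficient cannot express relative importance.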

3.7 In view of the above, claim 1 of the main request does not meet the requirements of Article 84 EPC 1973.

4. First, second, third and fourth auxiliary requests - clarity (Article 84 EPC 1973)

4.1 The terms specified in claim 1 of each of the auxiliary requests include: the deterioration amount in the presence of a localised video deterioration occurring on a time axis; an average deterioration amount in a steady state; and the user's subjective assessment characteristic of the video.

4.2 These terms do not have a well-defined meaning in the technical field of the present application, nor are they defined in the claims.

4.3 The board agrees with the examining division's assessment that the definition of the steady-state average deterioration amount set out in the description on page 18, line 19, to page 19, line 10, does not allow a person skilled in the art to determine the steady-state average deterioration amount referred to in claim 1 (see point XII above), i.e. to determine which values are averaged over which time period to determine the steady-state average Dcons.

4.3.1 The appellant's assertion that the steady-state average deterioration amount is more or less constant and that the skilled person can generally speak of the steady-state average deterioration amount Dcons without explicitly referring to a specific and single measurement interval ut (see point XIII(c) above) is not based on the disclosure of the application.

The amount Dcons "is the average value of the deterioration amounts C in a steady-state period obtained by removing a local video deterioration occurrence period from the unit measurement interval ut, and is calculated for each unit measurement interval ut" (see description, page 17, lines 8 to 15 and page 18, lines 19 to 24), with the unit measurement interval "ut >= one frame interval" (see point XIII(d) above).
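Read literally, the quoted definition amounts to an average over the samples of one unit measurement interval that remain after the local deterioration period has been removed. A minimal sketch under that reading follows; the representation of the interval as a list of amounts C with a parallel list of flags is an assumption made for the example, as are the names.

```python
def steady_state_average(c_values, is_local):
    """Average of the deterioration amounts C within one unit measurement
    interval, excluding samples flagged as local deterioration.

    c_values -- deterioration amounts C sampled within the interval
    is_local -- parallel flags, True where local deterioration occurs
    Returns None when no steady-state sample remains.
    """
    steady = [c for c, local in zip(c_values, is_local) if not local]
    if not steady:
        return None
    return sum(steady) / len(steady)
```

The sketch presupposes that several amounts C are available within the interval. With a single amount per interval, the result is either that amount itself or no value at all.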

(a) The condition "ut >= one frame interval" encompasses the case that ut equals one frame interval. This is exemplified by the definition of DS in equation (1) on page 14.

According to the paragraph bridging pages 17 and 18 of the description, the frame rate or the spatial feature amount DS can be used as the deterioration amount C. Since one value C (DS) is calculated for a unit measurement interval (frame), and Dcons is calculated for a unit measurement interval (frame), it is not clear which "amounts C" are averaged or how C can vary within a unit measurement interval (as depicted in Figures 9, 10 and 11).

Furthermore, the frame rate is not suitable for detecting a local deterioration, because it does not vary within the unit measurement period ut.

The appellant's argument that Dcons converges to an appropriate steady-state value as time passes (see point XIII(c) above) appears to be based on the assumption that Dcons is derived by averaging the value C for all previous unit measurement intervals. However, the application does not define the "steady-state period". Rather, Figure 11 gives the impression that Dcons is calculated as an average within a measurement interval ut but excluding duration t.
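The two readings contrasted in this paragraph lead to different values, as the following sketch suggests. It is an illustration only; the data layout (one list of steady-state amounts C per unit measurement interval) and the function names are assumptions, and neither function is presented as the method of the application.

```python
def per_interval_dcons(intervals):
    """Reading suggested by Figure 11: one average per interval, using only
    that interval's steady-state samples."""
    return [sum(s) / len(s) if s else None for s in intervals]


def cumulative_dcons(intervals):
    """Reading underlying the convergence argument: a running average over
    the steady-state samples of all intervals up to the current one."""
    results = []
    total = 0.0
    count = 0
    for samples in intervals:
        total += sum(samples)
        count += len(samples)
        results.append(total / count if count else None)
    return results
```

Only the cumulative variant settles towards a stable value over time; the per-interval variant can change freely from one interval to the next.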

(b) The condition "ut >= one frame interval" also encompasses the case that ut is longer than one frame interval (see point XIII(d) above). If plural "amounts C" are calculated and stored for the interval ut then these values can be used to determine an average Dcons over the interval.

However, the frame rate is defined by the number of frames within a given period. If the frame rate is defined as the number of frames within the period ut, it is not clear how multiple values for the frame rate are generated and stored to calculate the average Dcons. The description does not disclose any specific method for calculating the frame rate. In particular, it does not hint at calculating the inverse of the time difference between two frames to determine multiple values C within the measurement period (see point XIII(e) above).

4.3.2 The claims define neither the user's subjective characteristics nor how the coefficients are "preset" by these characteristics. The description does not provide a definition of the subjective characteristics either, nor does it mention the ITU-R recommendation submitted by the appellant (see point XIII(f) above). In particular, it does not disclose a link between the quality assessed according to the recommendation and the user's subjective assessment characteristics.

4.3.3 In summary, the description does not give an unambiguous definition of the vague terms specified in claim 1 either.

4.4 In view of the above, the requirements of Article 84 EPC 1973 are not met by claim 1 of any of the auxiliary requests.

5. Since none of the appellant's requests is allowable, the appeal is to be dismissed.

Order

For these reasons it is decided that:

The appeal is dismissed.
